/*
TODO:
1. The ability to rotate the figure 90 degrees so that the box plots are vertical 
   and the data and distribution are to their sides with the labels at top.
2. The ability to manipulate the box plot: change outline, tails, etc.
3. The ability to manipulate the distribution: format outline, insert mean and quartile lines, etc.

*/

/*** HELP START ***//*
 
## >>> `%RainCloudPlot()` macro: <<< <a name="raincloudplot-macro"></a> #######################  

The RainCloudPlot() macro allows plotting Rain Cloud plots, i.e., 
plots of kernel density estimates, jittered data values, and a box-and-whiskers plot.

See examples below for the details.

The "Here Comes the Rain (Cloud Plot) Again" (B. Jablonski, 2024) article
describing the macro is attached to the package as additional content.

### SYNTAX: ###################################################################

The basic syntax is the following, the `<...>` means optional parameters:
~~~~~~~~~~~~~~~~~~~~~~~sas
%RainCloudPlot(
   DS
  ,gr
  ,vars

 <,WidthPX=>
 <,HeightPX=>
 <,boxPlot=>
 <,roundFactor=>
 <,rainDropSize=>
 <,boxPlotSymbolSize=>
 <,boxPlotLineSize=>
 <,boxPlotFill=>
 <,whiskerScale=>
 <,meanShiftLine=>
 <,meanShiftStep=>
 <,meanShiftColors=>
 <,colorsList=>
 <,monochrome=>
 <,antialiasMax=>
 <,title=>
 <,footnote=>
 <,catLabels=>
 <,xLabels=>
 <,xBothAxis=>
 <,catLabelPos=> 
 <,xLabelPos=>
 <,catLabelAttrs=>
 <,xLabelAttrs=>
 <,formated=>
 <,y2axis=>
 <,y2axisLevels=>
 <,y2axisValueAttrs=>
 <,y2axisFormat=>
 <,y2axisLines=>
 <,catAxisValueAttrs=>
 <,xaxisValueAttrs=>
 <,xaxisTickstyle=>
 <,xaxisValues=>
 <,xaxisValuesDisplay=>
 <,xaxisValuesFormat=>
 <,xaxisValuesRotate=>
 <,xaxisOther=>
 <,sganno=>
 <,odsGraphicsOptions=>
 <,sgPlotOptions=>
 <,vertical=>

 <,VSCALE=>
 <,KERNEL_K=>
 <,KERNEL_C=>
 <,VSCALEmax=>
 <,minRange=>
 <,maxRange=>

 <,cleanTempData=>
 <,codePreview=>
 <,reuseN=>
)
~~~~~~~~~~~~~~~~~~~~~~~

**Arguments description**:

1. `DS`                - *Required*, name of the dataset from 
                         which variables are to be taken.

2. `gr`                - *Required*, name of the grouping variable. 
                         When more than one variable is specified 
                         separate plots are rendered.
                         Can be numeric or character.

3. `vars`              - *Required*, name of the aggregated numeric variable.
                         When more than one variable is specified 
                         separate plots are rendered.
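
As a minimal sketch of the three positional arguments, the calls below assume 
that the macro is already available in the session (for example, that the BasePlus 
package has been loaded) and use the `sashelp.cars` data set purely for illustration:

~~~~~~~~~~~~~~~~~~~~~~~sas
* one grouping variable and one analysis variable: a single plot;
%RainCloudPlot(sashelp.cars, origin, invoice)

* two grouping variables and two analysis variables: four separate plots;
%RainCloudPlot(sashelp.cars, origin type, invoice mpg_city)
~~~~~~~~~~~~~~~~~~~~~~~

In the second call one plot is rendered for every pair of a grouping variable 
and an analysis variable.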

***Plot related options***:

* `WidthPX`            - *Optional*, default value `1200`. 
                         Total width of the plot in pixels. 

* `HeightPX`           - *Optional*, default value `220`. 
                         Partial height of the plot in pixels.
                         Total height is calculated as `#GROUPS x HeightPX`,
                         but is never less than `600` pixels.

* `boxPlot`            - *Optional*, default value `1`.
                         Indicates if the Box Plot should be added. 
                          
* `roundFactor`        - *Optional*, default value `0.00000001`.
                         Rounding level when calculating maximum value 
                         of the cloud chart. Should be adjusted to data
                         granularity level, e.g., for data with values 
                         around `1e-8` it should be decreased. 

* `rainDropSize`       - *Optional*, default value `5px`.
                         Size of data points in the "rain" plot.

* `boxPlotSymbolSize`  - *Optional*, default value `8px`.
                         Size of symbols on the box plot.
                         If two values are provided, e.g., `16px 8px`,
                         the first is used for diamond size (the mean), 
                         the second for "min/max" bars.

* `boxPlotLineSize`    - *Optional*, default value `1px`.
                         Thickness of lines of the box plot.

* `boxPlotFill`        - *Optional*, default value `1`.
                         Transparency of the box plot.
                         Ranges from 0.0 (opaque) to 1.0 (fully transparent).

* `whiskerScale`       - *Optional*, default value `1.5`.
                         Provides the `WS` parameter in the `Q1 - WS*IQR`
                         and `Q3 + WS*IQR` formulas.
                         The value should be a positive number;
                         otherwise it is set to 0.

* `meanShiftLine`      - *Optional*, default value `0`.
                         Indicates if a line connecting the mean symbols 
                         of the box plots should be added.

* `meanShiftStep`      - *Optional*, default value `0.1`.
                         Sets how smooth the colour gradient should be
                         on the "mean-shift" line.

* `meanShiftColors`    - *Optional*, default value is empty.
                         List of colours for plotting the "mean-shift" line.
                         Empty indicates that the `colorsList` value will be used.

* `colorsList`         - *Optional*, default value is empty.
                         List of colours for plotting.
                         Empty indicates that the default list will be used.

* `monochrome`         - *Optional*, default value `0`.
                         Indicates if the default list of colours should be gray-scale.

* `antialiasMax`       - *Optional*, default value is empty.
                         Sets a value to the ODS graphics `ANTIALIASMAX` option.
                         When empty the value is calculated from data.

* `title`              - *Optional*, default value - see notes below.
                         Provides a list of titles printed on the plot.
                         For details see notes below.

* `footnote`           - *Optional*, default value - see notes below.
                         Provides a list of footnotes printed on the plot.
                         For details see notes below.

* `catLabels`          - *Optional*, default value is empty. 
                         List of values for group axis labels (vertical).
                         When empty a grouping variable name is used.
                         For details see notes below.

* `xLabels`            - *Optional*, default value is empty. 
                         List of values for data variable axis labels (horizontal).
                         When empty a data variable name is used.
                         For details see notes below.

* `xBothAxis`          - *Optional*, default value is `1`. 
                         Indicates if both horizontal axes (top and bottom) should be printed.
                         If not `1` then only the bottom axis is displayed.

* `catLabelPos`        - *Optional*, default value `DATACENTER`.
                         Indicates position of the label on group axis (vertical).
                         Allowed values are `BOTTOM`, `CENTER`, `DATACENTER`, and `TOP`.

* `xLabelPos`          - *Optional*, default value `DATACENTER`.
                         Indicates position of the label on data axis (horizontal).
                         Allowed values are `LEFT`, `CENTER`, `DATACENTER`, and `RIGHT`.

* `catLabelAttrs`      - *Optional*, default value is empty. 
                         List of attributes for group axis labels (vertical).
                         For details see notes below.

* `xLabelAttrs`        - *Optional*, default value is empty. 
                         List of attributes for data variable axis labels (horizontal).
                         For details see notes below.

* `formated`           - *Optional*, default value `0`.
                         Indicates if values of the grouping variable should be formatted.

* `y2axis`             - *Optional*, default value `1`.
                         Indicates if the right vertical axis should be displayed.

* `y2axisLevels`       - *Optional*, default value `4`.
                         Sets the number of expected levels of values printed 
                         on the right vertical axis.

* `y2axisValueAttrs`   - *Optional*, default value `Color=Grey`.
                         Allows modifying the Y2 axis value attributes.

* `y2axisFormat`       - *Optional*, default value `12.2-L`.
                         Allows modifying the Y2 axis value format.

* `y2axisLines`        - *Optional*, default value `0`.
                         If `1`, adds horizontal lines for Y2 axis values.

* `catAxisValueAttrs`  - *Optional*, default value `Color=Black`.
                         Allows modifying the category (Y) axis value attributes.

* `xaxisValueAttrs`    - *Optional*, default value `Color=Grey`.
                         Allows modifying the X axis value attributes.

* `xaxisTickstyle`     - *Optional*, default value `INSIDE`.
                         Allows modifying the X axis tick style. 
                         Allowed values are `OUTSIDE`, `INSIDE`, `ACROSS`, and `INBETWEEN`. 
                         *For SAS releases earlier than* **9.4M5** *set it to missing!*

* `xaxisValues`        - *Optional*, default value is empty.
                         It is a wrapper to provide value for 
                         the `XAXIS` statement, for `Values` option.

* `xaxisValuesDisplay` - *Optional*, default value is empty.
                         It is a wrapper to provide value for 
                         the `XAXIS` statement, for `ValuesDisplay` option.

* `xaxisValuesFormat`  - *Optional*, default value is empty.
                         It is a wrapper to provide value for 
                         the `XAXIS` statement, for `ValuesFormat` option.
                         Instead of using the `w.d` format, use its alias `Fw.d`.

* `xaxisValuesRotate`  - *Optional*, default value is empty.
                         It is a wrapper to provide value for 
                         the `XAXIS` statement, for `ValuesRotate` option.

* `xaxisOther`         - *Optional*, default value is empty.
                         It is a wrapper to provide value for 
                         the `XAXIS` statement options not mentioned above.
                         You can use it to provide, e.g., `GRID`, `MINOR`,
                         `MINORGRID`, `GRIDATTRS=`, or `MINORGRIDATTRS=`.

* `sganno`             - *Optional*, default value is empty.
                         Name of a data set for the `sganno=` option
                         of the SGPLOT procedure.

* `sgPlotOptions`      - *Optional*, default value is `noautolegend noborder subpixel`.
                         List of additional options for the SGPLOT procedure.

* `odsGraphicsOptions` - *Optional*, default value is empty.
                         List of additional options for the `ODS Graphics` statement.
                         By default only `width=`, `height=`, and `antialiasmax=`
                         are modified.

* `vertical`           - *Optional*, default value is `0`.
                         Set value to `1` to plot "clouds & boxes" vertically.
                         **NOTE:** *Before setting the parameter to `1`, first
                         prepare the plot in the "horizontal" version since all
                         other parameters assume that orientation(!) and then are 
                         converted accordingly.*
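
The call below is a hedged sketch of how a few of the plot related options 
can be combined. The `sashelp.cars` data set and all option values are chosen 
only for illustration, they are not recommended settings:

~~~~~~~~~~~~~~~~~~~~~~~sas
* taller panels, custom colours, larger mean symbol, shorter whiskers;
* grid lines on the data axis and reference lines for the Y2 axis values;
%RainCloudPlot(
  sashelp.cars
, origin
, invoice
, HeightPX=300
, colorsList=CornflowerBlue Crimson OliveDrab
, boxPlotSymbolSize=16px 8px
, whiskerScale=1
, xaxisOther=GRID
, y2axisLines=1
)
~~~~~~~~~~~~~~~~~~~~~~~

With three levels of `origin` and `HeightPX=300` the total height requested 
for the plot is 900 pixels.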

***Stat related options***:

* `VSCALE`             - *Optional*, default value `Proportion`.
                         Specifies the scale of the vertical axis.
                         Allowed values are `PROPORTION`, `PERCENT`, and `COUNT`.
                         `PROPORTION` scales the data in units of proportion of observations per data unit. 
                         `PERCENT` scales the data in units of percent of observations per data unit. 
                         `COUNT` scales the data in units of the number of observations per data unit.

* `KERNEL_K`           - *Optional*, default value `NORMAL`. 
                         Specifies type of kernel function to compute kernel density estimates.
                         Allowed values are `NORMAL`, `QUADRATIC`, and `TRIANGULAR`. 


* `KERNEL_C`           - *Optional*, default value `1`.
                         Specifies standardized bandwidth parameter *C* to compute kernel density estimates.
                         Allowed values are greater than `0` and not greater than `1`.

* `VSCALEmax`          - *Optional*, default value is empty.
                         Provides the maximum value for the Y2 axis scale. When used, an information
                         note is displayed in the log. Recommended values are:
                         for `VSCALE=PERCENT` between 0 and 100,
                         for `VSCALE=PROPORTION` between 0 and 1, and
                         for `VSCALE=COUNT` between 0 and N (sample size).

* `minRange`           - *Optional*, default value is `.` (numerical missing).
                         Indicates minimum value for x-axis on the plot, by default calculated from data.
                         It is a global parameter used for all plots.

* `maxRange`           - *Optional*, default value is `.` (numerical missing).
                         Indicates maximum value for x-axis on the plot, by default calculated from data.
                         It is a global parameter used for all plots.
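
A possible use of the stat related options is sketched below. The data set, 
the variables, and the axis range are assumptions made only for illustration:

~~~~~~~~~~~~~~~~~~~~~~~sas
* percent scale, quadratic kernel, narrower bandwidth, fixed data axis range;
%RainCloudPlot(
  sashelp.cars
, type
, mpg_city
, VSCALE=Percent
, KERNEL_K=Quadratic
, KERNEL_C=0.75
, minRange=0
, maxRange=70
)
~~~~~~~~~~~~~~~~~~~~~~~

The `KERNEL_C` value has to be greater than `0` and not greater than `1`, 
and `KERNEL_K` has to be one of the three listed kernel names; 
otherwise the macro prints an error message and stops.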

***Other options***:

* `cleanTempData`      - *Optional*, default value `1`.
                         Indicates if temporary data sets should be deleted.

* `codePreview`        - *Optional*, default value `0`.
                         Indicates if source code should be MPRINTed to log.

* `reuseN`             - *Optional*, default value `6`.
                         Indicates how many times colours list should be repeated
                         so that colours could be reused in case the number of groups
                         is greater than the colours list length.
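
When a plot does not look as expected, the following sketch (using `sashelp.cars` 
only as an illustrative input) shows one way to inspect what the macro does:

~~~~~~~~~~~~~~~~~~~~~~~sas
* print the generated code to the log and keep the intermediate data sets;
%RainCloudPlot(
  sashelp.cars
, origin
, invoice
, codePreview=1
, cleanTempData=0
)
~~~~~~~~~~~~~~~~~~~~~~~

With `cleanTempData=0` the intermediate data sets stay in the `WORK` library 
and can be reviewed after the call.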

---

### NOTES: ###################################################################

* Default value of the `title` option is:
  `%nrstr(title1 JUSTIFY=C "Rain Cloud plot for &list_g. by "  %unquote(&xLabel.);)`
  Use the `%str()` or `%nrstr()` macro-function to handle special characters.
  The `%unquote()` is used when resolving the parameter.

* Default value of the `footnote` option is:
  `%nrstr(footnote1 JUSTIFY=L COLOR=lightGray HEIGHT=1 "by RainCloudPlot macro from the BasePlus package";)`
  Use the `%str()` or `%nrstr()` macro-function to handle special characters.
  The `%unquote()` is used when resolving the parameter.

* The `catLabels` and `xLabels` should be quoted comma-separated lists enclosed in parentheses, 
  e.g. `catLabels=("Continent of Origin", "Car Type")`, see Example below.

* The `catLabelAttrs` and `xLabelAttrs` should be space-separated lists of `key=value` pairs,
  e.g. `xLabelAttrs=size=12 color=Pink weight=bold`, see Example below.

* Kernel density estimates and basic statistics are calculated with `PROC UNIVARIATE`.

* Plot is generated by `PROC SGPLOT` with `BAND`, `SCATTER`, and `POLYGON` plots.

* After execution the ODS graphics dimension parameters are set to `800px` by `600px`.

* SAS notes (`NOTE:`) are disabled during execution.

* Before setting the `vertical=` parameter to `1`, first prepare the plot 
  in the "horizontal" version since all other parameters assume that orientation(!) 
  and then are converted accordingly.

* List of predefined colours is: 
  `BlueViolet`, `RoyalBlue`, `OliveDrab`, `Gold`, `HotPink`, `Crimson`,  
  `MediumPurple`, `CornflowerBlue`, `YellowGreen`, `Goldenrod`, `Orchid`, `IndianRed`.
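
The conventions for labels, label attributes, and titles described in the notes 
can be sketched as follows. The label texts and the title are made up for 
illustration; `&list_g.` is the macro variable that also appears in the default title:

~~~~~~~~~~~~~~~~~~~~~~~sas
* one label per grouping variable, one label per analysis variable;
%RainCloudPlot(
  sashelp.cars
, origin type
, invoice
, catLabels=("Continent of Origin", "Car Type")
, xLabels=("Invoice, [USD]")
, xLabelAttrs=size=12 color=Pink weight=bold
, title=%nrstr(title1 JUSTIFY=C "Invoice by group: &list_g.";)
)
~~~~~~~~~~~~~~~~~~~~~~~

The `%nrstr()` wrapper delays resolution of `&list_g.` until the macro 
resolves the `title` parameter, so each plot gets its own list of group values.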

### BOX-AND-WHISKERS PLOT: ###################################################################

The box-and-whiskers plot has the following interpretation: 
- left vertical bar indicates the *minimum*,
- left whisker line starts at `max(Q1 - WS*IQR, minimum)` and ends at lower quartile (Q1),
- diamond indicates mean,
- vertical bar inside of the box indicates median,
- right whisker line starts at upper quartile (Q3) and ends at `min(Q3 + WS*IQR, maximum)`,
- right vertical bar indicates the *maximum*.

The `WS` value is provided through `whiskerScale=` parameter. Default value is `1.5`.

With this setup there may be a gap between the minimum marker and 
the start of the left whisker, or between the end of the right whisker 
and the maximum marker. 
See examples below.
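
The whisker formulas can be checked by hand with a short data step. 
The summary statistics below are hypothetical values chosen only to show 
one whisker that reaches the extreme value and one that does not:

~~~~~~~~~~~~~~~~~~~~~~~sas
data _null_;
  * hypothetical summary statistics, chosen only for illustration;
  minimum = 1; q1 = 10; q3 = 20; maximum = 60;
  ws  = 1.5;
  iqr = q3 - q1;
  * whisker ends as described in the list above;
  leftWhiskerEnd  = max(q1 - ws*iqr, minimum);
  rightWhiskerEnd = min(q3 + ws*iqr, maximum);
  put leftWhiskerEnd= rightWhiskerEnd=;
run;
~~~~~~~~~~~~~~~~~~~~~~~

Here `IQR` is 10, so the left whisker ends at 1, exactly at the minimum marker, 
while the right whisker ends at 35 and leaves a gap before the maximum marker at 60.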

---

*//*** HELP END ***/


%macro RainCloudPlot(
  DS
, gr
, vars

/* plot */
, WidthPX=1200
, HeightPX=220
, boxPlot=1
, roundFactor=0.00000001
, rainDropSize=5px
, boxPlotSymbolSize=8px
, boxPlotLineSize=1px
, boxPlotFill=1.0
, whiskerScale=1.5
, meanShiftLine=0
, meanShiftStep=0.1
, meanShiftColors=
, colorsList=
, monochrome=0
, antialiasMax=
, title=%nrstr(title1 JUSTIFY=C "Rain Cloud plot for &list_g. by "  %unquote(&xLabel.);)
, footnote=%nrstr(footnote1 JUSTIFY=L COLOR=lightGray HEIGHT=1 "by RainCloudPlot macro from the BasePlus package";)
, catLabels=
, xLabels=
, xBothAxis=1
, catLabelPos=DATACENTER 
, xLabelPos=DATACENTER
, catLabelAttrs=
, xLabelAttrs=
, formated=0
, y2axis=1
, y2axisLevels=4
, y2axisValueAttrs=Color=Grey
, y2axisFormat=12.2-L
, y2axisLines=0
, catAxisValueAttrs=Color=Black 
, xaxisValueAttrs=Color=Grey
, xaxisTickstyle=INSIDE
, xaxisValues=
, xaxisValuesDisplay=
, xaxisValuesFormat=
, xaxisValuesRotate=
, xaxisOther=
, sganno=
, odsGraphicsOptions=
, sgPlotOptions=noautolegend noborder subpixel
, vertical=0

/* stat */
, VSCALE=Proportion
, KERNEL_K=NORMAL /* NORMAL, QUADRATIC, TRIANGULAR */
, KERNEL_C=1
, VSCALEmax=
, minRange=.
, maxRange=.

/* other */
, cleanTempData=1
, codePreview=0
, reuseN=6
)
/
minoperator
;

%local tmp_notes;
%let tmp_notes = %sysfunc(getoption(notes)) %sysfunc(getoption(mprint));
/*%put &tmp_notes.;*/
options noNotes noMprint;


%local have base_have _prefix_ have_data_check;
%let _prefix_ = _%sysfunc(datetime(), hex16.)_;
%let have = work.&_prefix_.have;
%let base_have = work._base&_prefix_.have;


/* default value is 1 everything else goes to 0 */
%let boxPlot       = %sysevalf(1=%superq(boxPlot));
%let y2axis        = %sysevalf(1=%superq(y2axis));
%let cleanTempData = %sysevalf(1=%superq(cleanTempData));
%let vertical      = %sysevalf(1=%superq(vertical));
%let xBothAxis     = %sysevalf(1=%superq(xBothAxis));

/* default value is 0 everything else goes to 1 */
%let y2axisLines   = %sysevalf(NOT(0=%superq(y2axisLines)));
%let codePreview   = %sysevalf(NOT(0=%superq(codePreview)));
%let monochrome    = %sysevalf(NOT(0=%superq(monochrome)));
%let formated      = %sysevalf(NOT(0=%superq(formated)));
%let meanShiftLine = %sysevalf(NOT(0=%superq(meanShiftLine)));



data _null_; run;

%if 1=&codePreview. %then
  %do;
    %put INFO: Enabling MPRINT;
    options Mprint; run;
  %end;

%if %superq(sganno) NE %then
%do;
  %if NOT %sysfunc(exist(%superq(sganno))) %then
    %do;
      %put INFO: There is a problem with input data!;
      %put %str(      )The sganno data set: %superq(sganno) does not exist!;
      %put %str(      )Stopping.;
      %GOTO noRainCloudPlotToday;
    %end;
%end;

%if NOT %sysevalf(0 < %superq(KERNEL_C) AND %superq(KERNEL_C) <= 1, boolean) %then
%do;
    %put ERROR: There is a problem with KERNEL_C parameter!;
    %put ERROR- The value has to satisfy: 0 < KERNEL_C <= 1, provided one is: %superq(KERNEL_C)!;
    %put ERROR- Stopping.;
    %GOTO noRainCloudPlotToday;
%end;

%if NOT (%qupcase(&KERNEL_K.) IN (NORMAL QUADRATIC TRIANGULAR)) %then
%do;
    %put ERROR: There is a problem with KERNEL_K parameter!;
    %put ERROR- The value has to be: NORMAL, QUADRATIC, or TRIANGULAR, provided one is: %superq(KERNEL_K)!;
    %put ERROR- Stopping.;
    %GOTO noRainCloudPlotToday;
%end;


data &base_have.;
  set &ds.;
  keep &gr. &vars.;
run;
%let have_data_check = &syserr.;

%if &have_data_check. %then
%do;
  %put INFO: There is a problem with input data!;
  %put %str(      )Stopping.;
  %GOTO noRainCloudPlotToday;
%end;

%if %superq(colorsList) = %then
  %do;
    %if 1 = &monochrome. %then
      %do;
        %let colorsList = Black DarkSlateGrey DimGray Gray DarkGray Silver LightGray Gainsboro;
      %end;
    %else
      %do;
        %let colorsList = BlueViolet RoyalBlue OliveDrab Gold HotPink Crimson  
                          MediumPurple CornflowerBlue YellowGreen Goldenrod Orchid IndianRed;
      %end;
  %end;

%if %superq(meanShiftColors) = %then
  %do;
    %let meanShiftColors = &colorsList.;
  %end;

%if %superq(xaxisTickstyle) ne %then
      %do;
        %if not (%qupcase(&xaxisTickstyle.) in (OUTSIDE INSIDE ACROSS INBETWEEN)) %then
          %do;
            %put NOTE! The xaxisTickstyle value is not valid and was set to missing.;
            %let xaxisTickstyle=;
          %end;
      %end;

%local _x_ _y_ _slope_;
%if 1 = &vertical. %then
%do;
  %let _x_ = y;
  %let _y_ = x;
  %let _slope_ = .;
%end;
%else
%do;
  %let _x_ = x;
  %let _y_ = y;
  %let _slope_ = 0;
%end;


%local g x gi xj catLabel xLabel;
%do gi = 1 %to %sysfunc(countw(&gr.));
 %do xj = 1 %to %sysfunc(countw(&vars.));
/*+++++++++++++*/
 %let g = %scan(&gr., &gi.);
 %let x = %scan(&vars., &xj.);

 %let catLabel =;
 %let xLabel   =;
 %let catLabel = %qscan(%superq(catLabels), &gi., %str((,)), Q);
 %let xLabel   = %qscan(%superq(xLabels), &xj., %str((,)), Q);

%if %superq(catLabel) = %then
  %do;
    %let catLabel = "&g.";
  %end;

%if %superq(xLabel) = %then
  %do;
    %let xLabel = "&x.";
  %end;

/* base data */
data &have.;
  set &base_have.(keep = &g. &x.);
run;

/* kernel plot */
%local univariateError;
ods select none;
proc univariate data = &have.(keep = &g. &x.) noprint PLOTS() ;
  class &g.;
  var &x.;
  histogram / OUTKERNEL=work.&_prefix_.kernelData
              kernel(k=&kernel_k. c=&kernel_c.)
              VSCALE=&vscale.
              /*NMIDPOINTS = &NMIDPOINTS.*/
              NOPLOT 
            ;
  %if 1=&boxPlot. %then
    %do;
      output out = work.&_prefix_.basicStats
        MIN=min Q1=q1 MEAN=MEAN MEDIAN=MEDIAN Q3=q3 MAX=max;
    %end;
run;
%let univariateError=&syserr.;
ods select all;

%if &univariateError. %then
%do;
  %put ERROR: Proc Univariate error for &g. and &x.!;
%end;
%else 
%do; /*- no univariate error --------------------------------------------*/

proc sort
  data = work.&_prefix_.kernelData
  out = work.&_prefix_.kernelData
;
  by descending &g. _value_ _&vscale._;
run;

proc sql noprint;
  select 
    %if %superq(VSCALEmax)= %then
      round(max(_&vscale._), &roundFactor.) + &roundFactor.
    ;
    %else
      %do;
        %put INFO: VSCALE=&VSCALE.. Maximum value used: %superq(VSCALEmax);
        %put %str(      )Recommended values are:;
        %put %str(      )- VSCALE=PERCENT between 0 and 100.;
        %put %str(      )- VSCALE=PROPORTION between 0 and 1.;
        %put %str(      )- VSCALE=COUNT between 0 and N (sample size).;
        &VSCALEmax.
      %end;
    /**/
  , coalesce(&minRange.,min(_value_))
  , coalesce(&maxRange.,max(_value_))
  into 
    :_MAX_
  , :minX
  , :maxX
  from 
    work.&_prefix_.kernelData
  ;
quit;
/*
%put &=_MAX_.;
%put &=minX.;
%put &=maxX.;
*/

/* used for Y2 axis values and lines */
%local jump jumpFunc;
%if %upcase(&vscale.) = COUNT %then
   %let jumpFunc = round;
%else
    %let jumpFunc = abs;

%let jump = %sysfunc(&jumpFunc.(&_MAX_. / &y2axisLevels.));


data 
  work.&_prefix_.kernelData (where=(x between &minX. and &maxX.))
  work.&_prefix_.hLine(keep = &g.)
;
  set work.&_prefix_.kernelData;
  by notsorted &g.;
      retain plus -%sysevalf(1 *&_MAX_.);
      plus + first.&g. * &_MAX_. * 2;
      x = _value_;
      yu = _&vscale._ + plus;
      yl = plus;
      output work.&_prefix_.kernelData;
      if last.&g. then output work.&_prefix_.hLine;
  keep &g. x y:;
run;

/* rain drops */
proc sort
  data = &have.(keep = &g. &x.)
  out = work.&_prefix_.rainDrops;
  by descending &g. &x.;
run;

data work.&_prefix_.rainDrops;
  set work.&_prefix_.rainDrops;
  by notsorted &g.;
  retain plus -%sysevalf(1 *&_MAX_.);
  plus + first.&g. * &_MAX_. * 2 ;
  x = &x.;
  y1 = plus - ((0.1 + ranuni(42) * 0.2) * &_MAX_.);
  keep &g. x y1;
run;

/* categories and values */
data _null_;
  if 0 then set work.&_prefix_.hLine(keep = &g.) NOBS=NOBS;
  call symputX("no_g", NOBS, "L");
  call symputX("gFormat", vformat(&g.), "L");
  stop;
run;

%local list_g qlist_g clist_g;
proc sql noprint;
  select 
    %if 1 = &formated. %then &g.; 
                       %else catt(&g.);
    as g
  , list
  , &g. as g2
  into 
    :list_g separated by ", "
  , :qlist_g separated by " ' ' "
  , :clist_g separated by " "
  from (
    select 
      distinct &g.
    , %if 1 = &formated. %then quote(strip(put(&g.,&gFormat.)));
                         %else quote(catt(&g.)); 
      as list 
    from work.&_prefix_.hLine(keep = &g.)
  )
  order by &g. desc
  ;

  select 
    %if 1 = &formated. %then &g.; 
                       %else catt(&g.);
    as g
  into 
    :list_g separated by ", "
  from (
        select distinct &g.
        from work.&_prefix_.hLine(keep = &g.)
       )
  order by &g.
  ;
quit;

/* horizontal lines */
data work.&_prefix_.hLine;
  set work.&_prefix_.hLine(keep = &g.);
  by notsorted &g.;
  retain plus -%sysevalf(1 *&_MAX_.);
  plus + first.&g. * &_MAX_. * 2 ;

  y4_hLine = plus; /* to make horizontal lines */
  output;
  y4_hLine = plus - (0.75 * &_MAX_.); /* to make horizontal lines */
  output;

  keep &g. y4_hLine;

  %if 1=&y2axisLines. %then
    %do;
      y4_hLine=.;
      do j = 1 to &y2axisLevels.;
        y4_y2axisLine = plus + j*&jump.;
        output;
      end;
      keep y4_y2axisLine;
    %end;
run;

/* box plot data */
%if 1=&boxPlot. %then
  %do;
    proc sort
      data = work.&_prefix_.basicStats
      out = work.&_prefix_.basicStats;
      by descending &g.;
    run;

    data work.&_prefix_.basicStats;
      set work.&_prefix_.basicStats;
      by notsorted &g.;
      retain plus -%sysevalf(1 *&_MAX_.);
      plus + first.&g. * &_MAX_. * 2 ;
      y1 = plus - (0.5 * &_MAX_.);
      drop plus;
    run;

    data work.&_prefix_.boxPlot;
      set work.&_prefix_.basicStats;
      by notsorted &g.;

      y2u = y1 + 0.1 * &_MAX_.;
      y2l = y1 - 0.1 * &_MAX_.;

      y2 = y2u;
        do x  = median, q3;
          output;
        end;
      y2 = y2l;
        do x  = q3, median;
           output;
        end;

      y2 = y2u;
        do x  = median, q1;
          output;
        end;
      y2 = y2l;
        do x  = q1, median;
           output;
        end;

      keep &g. x y2;
    run;

    data work.&_prefix_.boxWhiskers1(keep = &g. x y31)
         work.&_prefix_.boxWhiskers2(keep = &g. x y32)
      ;
      set work.&_prefix_.basicStats;
      by notsorted &g.;

      iqr = abs(q3-q1);
      ws = max(input(symget('whiskerScale'), ?? best32.),0); /* scales whiskers */
      y31 = y1;
      x   = q3; output work.&_prefix_.boxWhiskers1;
      x   = min(q3+ws*iqr,max); output work.&_prefix_.boxWhiskers1;
      x   = q3; output work.&_prefix_.boxWhiskers1;

      y32 = y1;
      x   = q1; output work.&_prefix_.boxWhiskers2;
      x   = max(q1-ws*iqr,min); output work.&_prefix_.boxWhiskers2;
      x   = q1; output work.&_prefix_.boxWhiskers2;
    run;

    %if 1=&meanShiftLine. %then
      %do;
        data work.&_prefix_.MEANshift(where=(rms is not null));
          set work.&_prefix_.basicStats(
                keep = y1 mean &g.
                rename =(y1=y mean=x)
              );
            r+1;

            xp=lag(x);
            yp=lag(y);
            rp=lag(r);

            if 1 < _N_;

            do i = 0 to 1 by &meanShiftStep.;
              output;
              xms= xp + (x-xp)*i;
              yms= yp + (y-yp)*i;
              output;
              rms= rp + (r-rp)*i;
            end;

          keep  xms yms rms &g.;
        run;
      %end;
  %end;

/* final data for plotting */
data work.&_prefix_.RainCloudPlot;
  set
    work.&_prefix_.kernelData
    work.&_prefix_.rainDrops
    work.&_prefix_.hLine 
    %if 1=&boxPlot. %then
      %do;
        work.&_prefix_.basicStats(keep=&g. y1 min mean max)
        work.&_prefix_.boxPlot
        work.&_prefix_.boxWhiskers1
        work.&_prefix_.boxWhiskers2
        %if 1=&meanShiftLine. %then
          %do;
            work.&_prefix_.MEANshift;
          %end;
      %end;
  ;
run;

%if %superq(ANTIALIASMAX)= %then
%do;
  proc sql noprint;
    select count (*) into :ANTIALIASMAX 
    from work.&_prefix_.RainCloudPlot
  ; 
  quit;
%end;

ods graphics / 
  %if 1=&vertical. %then
    %do;
      height = &WidthPX.px 
      width = %sysfunc(max(%sysevalf(&no_g. * &HeightPX.), 600))px
    %end;
  %else
    %do;
      width = &WidthPX.px 
      height = %sysfunc(max(%sysevalf(&no_g. * &HeightPX.), 600))px
    %end;
  ANTIALIASMAX=&ANTIALIASMAX.

  %if %superq(odsGraphicsOptions) NE %then
    %do;
      %unquote(&odsGraphicsOptions.)
    %end;

;
%let AntiAliasMAX=;

%unquote(&title.)
;
%unquote(&footnote.)
;

proc sgplot 
  data = work.&_prefix_.RainCloudPlot 

  /* add annotations */
  %if %superq(sganno) ne %then 
    %do;
      sganno=&sganno.
    %end;

  /* additional options */
  %if %superq(sgPlotOptions) NE %then
    %do;
      %unquote(&sgPlotOptions.)
    %end;
;

  %local i;
  styleattrs
    DATACONTRASTCOLORS = (%do i = 1 %to &reuseN.; &colorsList. %end;)
    DATACOLORS         = (%do i = 1 %to &reuseN.; &colorsList. %end;)
  ;

  band &_x_.=x upper=yu lower=yl
  / GROUP = &g.
    FILL
    OUTLINE 
    LINEATTRS = (color=cxfafafa thickness=1px pattern=solid)
  ;

  %if 1 = &y2axis. AND 1=&y2axisLines. %then 
  %do;
  LINEPARM &_x_. = &minX. &_y_. = y4_y2axisLine SLOPE = &_slope_. 
  / GROUP = &g.
    LINEATTRS = (color=lightgray thickness=1px pattern=solid)
    TRANSPARENCY=0.5 
    &_y_.2axis;
  ;
  /* to have plot over horizontal lines*/
  band &_x_.=x upper=yu lower=yl
  / GROUP = &g.
    FILL
    noOUTLINE 
    LINEATTRS = (color=cxfafafa thickness=1px pattern=solid)
  ;
  %end;

  &_y_.axis DISPLAY = (noline noticks /*nolabel*/) 
    LABEL = %unquote(&catLabel.) 
    LABELPOS=&catLabelPos.
    %if %superq(catLabelAttrs) ne %then
      %do;
        LABELATTRS=(&catLabelAttrs.)
      %end;
    OFFSETMIN = 0.01 
    OFFSETMAX = 0.03
    VALUES = (0 to %sysevalf((&no_g. * 2) *&_MAX_.) by &_MAX_.)
    VALUESDISPLAY = (" " &qlist_g. " ")
    VALUEATTRS=(&catAxisValueAttrs.)
  ;

  %if 1 = &y2axis. %then 
  %do;
    /* &jump. macro variable is calculated earlier */

    &_y_.2axis DISPLAY = (noline noticks)
      LABEL = "%lowcase(&VSCALE.)" 
      LABELPOS=&catLabelPos. 
      LABELATTRS=(Color=Grey)   
      OFFSETMIN = 0.01 
      OFFSETMAX = 0.03
      VALUES = (0
      %local i j;   
        %do i = 1 %to %sysevalf(&no_g.);
          %do j = 0 %to &y2axisLevels.;
            %sysevalf(&_MAX_. + 2*&_MAX_.*(&i.-1) + &j.*&jump.)
          %end;
        %end;
      )
      VALUESDISPLAY = (" "
      %local i j;   
        %do i = 1 %to %sysevalf(&no_g.);
          %do j = 0 %to &y2axisLevels.;
            %if &j. > 0 %then
              %do;
                "%sysfunc(abs(&j. * &jump.), &y2axisFormat.)"
              %end;
            %else " ";
          %end;
        %end;
      )
      VALUEATTRS=(&y2axisValueAttrs.)
    ;
  %end; 

  &_x_.axis 
    LABELPOS=&xLabelPos. 
    LABEL = %unquote(&xLabel.) 
    %if %superq(xLabelAttrs) ne %then
      %do;
        LABELATTRS=(&xLabelAttrs.)
      %end;
    min=&minX. 
    max=&maxX. 
    %if %superq(xaxisTickstyle) ne %then
      %do;
        TICKSTYLE=&xaxisTickstyle.
      %end;
    VALUEATTRS=(&xaxisValueAttrs.) /*xaxisValueAttrs= Color=Grey*/
    %if %superq(xaxisValues) NE
      %then VALUES=&xaxisValues.;
    %if %superq(xaxisValuesDisplay) NE
      %then VALUESDISPLAY=&xaxisValuesDisplay.;
    %if %superq(xaxisValuesFormat) NE
      %then VALUESFORMAT=&xaxisValuesFormat.;
    %if &vertical = 0 AND %superq(xaxisValuesRotate) NE
      %then VALUESROTATE=&xaxisValuesRotate.;
    %if %superq(xaxisOther) NE
      %then &xaxisOther.;
  ;

  %if 1 = &xBothAxis. %then
    %do;
      &_x_.2axis DISPLAY=(nolabel) 
        MIN=&minX. 
        MAX=&maxX. 
        %if %superq(xaxisTickstyle) ne %then
          %do;
            TICKSTYLE=&xaxisTickstyle.
          %end; 
        VALUEATTRS=(&xaxisValueAttrs.)
        %if %superq(xaxisValues) NE
          %then VALUES=&xaxisValues.;
        %if %superq(xaxisValuesDisplay) NE
          %then VALUESDISPLAY=&xaxisValuesDisplay.;
        %if %superq(xaxisValuesFormat) NE
          %then VALUESFORMAT=&xaxisValuesFormat.;
        %if &vertical = 0 AND %superq(xaxisValuesRotate) NE
          %then VALUESROTATE=&xaxisValuesRotate.;
        %if %superq(xaxisOther) NE
          %then &xaxisOther.;
      ;
    %end;
  %else
    %do;
      &_x_.2axis DISPLAY=(NOLABEL NOLINE NOTICKS NOVALUES);
    %end;

  scatter &_x_. = x &_y_. = y1
  / group=&g. 
    MARKERATTRS=(symbol=CircleFilled size=&rainDropSize. /*5px*/ )
    %if 1 = &xBothAxis. %then &_x_.2axis ;
  ;

  LINEPARM &_x_. = &minX. &_y_. = y4_hLine SLOPE = &_slope_. 
  / GROUP = &g.
    LINEATTRS = (thickness=1px pattern=solid)
    TRANSPARENCY=0.5 
    %if 1 = &y2axis. %then %do; &_y_.2axis %end;
  ;

  %if 1=&boxPlot. %then
    %do;
      /* mean shift line */
      %if 1=&meanShiftLine. %then
        %do;
          series &_x_.=xms &_y_.=yms /
            group = rms
            COLORRESPONSE = rms
            colormodel=(
              %local i mscN;
              %let mscN = %sysfunc(countw(&meanShiftColors.));
              %do i = 0 %to %sysevalf(&no_g.-1);
                %scan(&meanShiftColors., %sysevalf(%sysfunc(mod(&i.,&mscN.))+1) ,%str( ))
              %end;
            )
            lineattrs=(thickness=&boxPlotLineSize. pattern=solid) 
            transparency=0.0
          ;
        %end;

      polygon &_x_.=x &_y_.=y2 ID=&g.                                           
      / group=&g. outline lineattrs=(thickness=&boxPlotLineSize. pattern=solid) 
        FILL fillattrs=(transparency=&boxPlotFill.)
      ;
      polygon &_x_.=x &_y_.=y31 ID=&g.
      / group=&g. outline lineattrs=(thickness=&boxPlotLineSize. pattern=solid) 
        FILL fillattrs=(transparency=&boxPlotFill.)
      ;
      polygon &_x_.=x &_y_.=y32 ID=&g.
      / group=&g. outline lineattrs=(thickness=&boxPlotLineSize. pattern=solid) 
        FILL fillattrs=(transparency=&boxPlotFill.)
      ;
       
      symbolchar name=vertBar char='01C0'x / SCALE=2 rotate=%sysevalf(90*&vertical.);
      scatter &_x_. = min  &_y_. = y1 / group=&g. MARKERATTRS=(symbol=vertBar size=%scan(&boxPlotSymbolSize., -1) /*8px*/ );
      scatter &_x_. = mean &_y_. = y1 / group=&g. MARKERATTRS=(symbol=Diamond size=%scan(&boxPlotSymbolSize.,  1) /*8px*/ );
      scatter &_x_. = max  &_y_. = y1 / group=&g. MARKERATTRS=(symbol=vertBar size=%scan(&boxPlotSymbolSize., -1) /*8px*/ );
    %end;

run;

title; 
footnote;

/* set the ODS graphics size to a fixed 800px x 600px after plotting */
ods graphics / width=800px height=600px;

%if 1 = &cleanTempData. %then
%do;
  proc delete
  data = &have.
    work.&_prefix_.kernelData
    work.&_prefix_.RainCloudPlot
    work.&_prefix_.rainDrops
    work.&_prefix_.hLine
    %if 1=&boxPlot. %then
      %do;
        work.&_prefix_.basicStats
        work.&_prefix_.boxPlot
        work.&_prefix_.boxWhiskers1
        work.&_prefix_.boxWhiskers2
        %if 1=&meanShiftLine. %then
          %do;
            work.&_prefix_.MEANshift
          %end;
      %end;
  ;
  run;
%end;

%end; /*- no univariate error --------------------------------------------*/

/*+++++++++++++*/
 %end;
%end;

proc delete data = &base_have.;
run;
data _null_;
run;

/* end of processing */
%noRainCloudPlotToday:

options &tmp_notes.;
run;
%mend RainCloudPlot;


/*** HELP START ***//*
 
### EXAMPLES AND USECASES: ####################################################

**EXAMPLE 1.** Simple Rain Cloud Plot for a `have` dataset:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sas

  data have;
    g = "Aaa";
    do _N_ = 1 to 50;
      x = rannor(42);
      y = ranuni(42);
      output;
    end;

    g = "Bb";
    do _N_ = 1 to 120;
      select (mod(_N_,9)); 
        when(1,2,3,4,5) x = 0.5*rannor(42)+1;
        when(6,7,8)   x = 0.5*rannor(42)+3;
        otherwise   x = 0.5*rannor(42)+5;
      end; 
      y = ranuni(42)+1;
      output;
    end;

    g = "C";
    do _N_ = 1 to 60;
      x = 3*rannor(42)+7;
      y = ranuni(42)+2;
      output;
    end;
  run;

  %RainCloudPlot(have, g, x y)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The output can be seen in the `md` file.
![Example 1x](./baseplus_RainCloudPlot_Ex1x.png)
![Example 1y](./baseplus_RainCloudPlot_Ex1y.png)
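
A possible variation of Example 1, sketched from the macro source rather than
from a tested run: `boxPlot=0` skips the box-and-whiskers polygons together with
the min, mean, and max markers, and `xBothAxis=1` shows the value axis on both
sides of the plot. It assumes the `have` dataset created above; no output
figure is shown for it here.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sas

  %RainCloudPlot(have, g, x
  , boxPlot=0
  , xBothAxis=1
  )

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~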


**EXAMPLE 2.** Rain Cloud plots for the `sashelp.cars` dataset,
               grouped by Origin and, separately, by Type,
               for the Invoice variable:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sas

  %RainCloudPlot(
    sashelp.cars(where=(Type ne "Hybrid"))
  , Origin Type
  , Invoice 
  , HeightPX=300 
  , y2axisLevels=3
  , catLabels=("Continent of Origin", "Car Type")
  , xLabels="Invoice, [$]"
  , xLabelAttrs=size=12 color=Pink weight=bold
  )

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The output can be seen in the `md` file.
![Example 2a](./baseplus_RainCloudPlot_Ex2a.png)
![Example 2b](./baseplus_RainCloudPlot_Ex2b.png)
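
The macro also offers a "mean shift" line (the `meanShiftLine=` parameter).
The sketch below is a variation of Example 2 built on assumptions read from
the macro source: the line is drawn only when both `boxPlot=1` and
`meanShiftLine=1`, and `meanShiftColors=` is read as a space-separated list
that is reused cyclically when there are more groups than colors. Default
values are not assumed, so both switches are set explicitly; the color names
are an illustrative choice.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sas

  %RainCloudPlot(
    sashelp.cars(where=(Type ne "Hybrid"))
  , Origin
  , Invoice
  , HeightPX=300
  , boxPlot=1
  , meanShiftLine=1
  , meanShiftColors=Red Green Blue
  , xLabels="Invoice, [$]"
  )

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~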

**EXAMPLE 3.** Rain Cloud plot with formatted groups
               and annotations:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sas

  data annotation; 
    function="text";
    label="This graph is full(*ESC*){sup '2'} of annotations!";
    drawspace="graphpercent";
    rotate=30;
    anchor="center";
    textsize=32;
    x1=50;
    y1=50;
    textcolor="red"; 
    justify="center";
    textweight="bold";
    width=100;
    widthunit="percent";
  run; 

  proc format;
    value system
    1="Windows"
    2="MacOS"
    3="Linux"
    ;
  run;

  data test;
    do system = 1 to 3;
      do i = 1 to 50;
        x = 2*rannor(123)/system - system;
        output;
      end;
    end;
    format system system.;
  run;

  %RainCloudPlot(test, system, x
  , colorslist=CX88CCEE CX44AA99 CX117733
  , formated=1
  , sganno=annotation
  , sgPlotOptions=noborder
  , WidthPX=1000
  , HeightPX=380
  , catAxisValueAttrs=Color=Green weight=bold
  , VSCALE=percent
  , cleanTempData=0
  , y2axisLevels=5
  , y2axisFormat=words20.-L
  , VSCALEmax=60
  , codePreview=1
  , y2axisValueAttrs=Color=Grey size=10px
  , y2axisLines=1
  , xLabels="variable X"
  , title = %nrstr(title1;
                   title2 JUSTIFY=L "Rain Cloud plot for &list_g. by "  %unquote(&xLabel.);
                   title3 JUSTIFY=C "Rain Cloud plot for &list_g. by "  %unquote(&xLabel.);
                   title4 JUSTIFY=R "Rain Cloud plot for &list_g. by "  %unquote(&xLabel.);
            )
  )

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The output can be seen in the `md` file.
![Example 3](./baseplus_RainCloudPlot_Ex3.png)
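
Example 3 passes the title inside `%nrstr()`, so the macro variable references
in it are not resolved at call time but later, inside the macro, where
`&list_g.` and `&xLabel.` exist. A shorter sketch of the same technique,
assuming the `test` dataset and the `system.` format from Example 3 and using
only `&list_g.` (its exact content is set by the macro, not by the caller):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sas

  %RainCloudPlot(test, system, x
  , formated=1
  , xLabels="variable X"
  , title = %nrstr(title1 JUSTIFY=C "Rain Cloud plot for &list_g.";)
  )

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~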


**EXAMPLE 4.** Rain Cloud plot for the `sashelp.cars` dataset,
               grouped by Drive Train, for the Weight (LBS)
               variable plotted "vertically":
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sas

  %RainCloudPlot(
    sashelp.cars
  , DriveTrain
  , Weight 
  , HeightPX=400 
  , colorslist=Red Green Blue
  , y2axisLevels=5
  , catLabels=("DriveTrain")
  , xLabels="Weight (LBS)"
  , xLabelAttrs=size=12 color=Black weight=bold
  , y2axisLines=1
  , vscale=percent
  , vscalemax=50
  , vertical = 1
  , title = %nrstr(title1 J=C HEIGHT=3 "The VERTICAL plotting is cool, ...";)
  , footnote = %nrstr(footnote1 J=L HEIGHT=2 "... isn't it?";)
  , whiskerScale = 1.5
  )

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The output can be seen in the `md` file.
![Example 4](./baseplus_RainCloudPlot_Ex4.png)
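
The value axis can be tuned with the `xaxis...=` parameters. Judging from the
macro source, `xaxisValues=`, `xaxisValuesFormat=`, and `xaxisTickstyle=` are
passed to the `VALUES=`, `VALUESFORMAT=`, and `TICKSTYLE=` options of the
SGPLOT axis statement, while `xaxisValuesRotate=` is applied only when
`vertical=0`. A hedged sketch; the tick range below is an illustrative choice
for `Weight`, not a value taken from the package:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~sas

  %RainCloudPlot(
    sashelp.cars
  , DriveTrain
  , Weight
  , xLabels="Weight (LBS)"
  , xaxisValues=(1000 to 8000 by 1000)
  , xaxisValuesFormat=comma12.
  , xaxisTickstyle=inside
  )

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~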

---

*//*** HELP END ***/

/**###################################################################**/
/*                                                                     */
/*  Copyright Bartosz Jablonski, since 2021.                           */
/*                                                                     */
/*  Code is under the MIT license. If you want - you can use it.       */
/*  But it comes with absolutely no warranty whatsoever.               */
/*  If you cause any damage or something - it will be your own fault.  */
/*  You've been warned! You are using it on your own risk.             */
/*  However, if you decide to use it don't forget to mention author.   */
/*  Bartosz Jablonski (yabwon@gmail.com)                               */
/*                                                                     */
/**###################################################################**/
