* A supermarket is offering a new line of organic products. The supermarket’s management wants to determine which customers are likely to purchase these products.
* The supermarket has a customer loyalty program. As an initial buyer incentive plan, the supermarket provided coupons for the organic products to all of its loyalty program participants and collected data that includes whether or not these customers purchased any of the organic products.
* The ORGANICS data set has 13 variables and over 22,000 observations. The variables in the data set are shown below with the appropriate roles and levels.
Define the data set ORGANICS as a data source for the project. Use the Basic option in the Metadata Advisor Options.
c. Compare the model role and measurement level of each variable in step 5 of Metadata with the model roles and measurement levels in the table printed above. Point out any mismatch in your answer. (1 point)
d. Fix any mismatch in the roles or measurement levels found in the previous step (click on the role or level to change it).
e. Continue adding the data source. Do not use the decision processing step in the Data Source Wizard. In the final step, set the data set role to Raw.
f. Right-click on the data in the project panel and select Explore. Make sure you set the sampling method to Random and the sample size to Max in the sample properties. Scroll through the first 26 observations. Which variables seem to have missing values in the first 26 observations? Also, show a screenshot of at least the first 26 observations. (1 point)
g. Instead of exploring the data from the project panel, you will now do it from the diagram.
Make sure you set the sampling method to Random and the sample size to Max in Options > Preferences. Drag the data to the diagram. Then right-click on the data source in the diagram and choose Edit Variables. In the pop-up box, select any variable you want to explore and then click on the Explore button.
  1) Select TargetBuy. Create a frequency histogram for the variable TargetBuy. Make sure the vertical axis shows percentages and that the percentage values are displayed in the histogram (hint: right-click on the graph…). Turn in a copy of the histogram as part of your deliverable. (1.5 points)
  2) Create a frequency histogram for the variable DemGender. Comment on what you see in this histogram and turn in a copy of the histogram as a deliverable. (1 point)
  3) Create a frequency histogram for the variable DemAge. Comment on what you see in this histogram and turn in a copy of the histogram as a deliverable. (1 point)
  4) Create a frequency histogram for the variable PromSpend (make sure you change the number of bins to 30).
  Is this variable right or left skewed? Does that make sense based on the variable’s description? Turn in a copy of the histogram as a deliverable. (1.5 points)

This part is a stand-alone exercise using the File Import node. First, use the File Import node to import the Excel data file Smalldata. Run the File Import node and then answer the following questions.
h. Does any variable have a role of Rejected? If yes, can you guess why it is rejected? (1 point)
i. There are several variables to which SAS EM may have assigned a measurement level of Interval, but which should really be Binary. Which variables are these, and why should they have a Binary measurement level? Fix their levels to Binary before answering the next question. (1 point)
j. Attach a StatExplore node to the File Import node and run the StatExplore node. Report the class variable summary statistics and the interval variable summary statistics from the output window of the StatExplore node. (1 point)
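
As an optional cross-check on what the Explore window and the StatExplore node report, the ideas behind steps f, 1) through 4), and i can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy rows and the names `rows`, `column`, `missing_count`, `percent_frequencies`, `skewness`, and `looks_binary` are hypothetical and are not taken from ORGANICS, Smalldata, or SAS EM. The skewness here is the simple moment-based definition, so its value may differ from the statistic SAS prints, although a positive value still indicates right skew.

```python
from collections import Counter

# Hypothetical toy rows (NOT from ORGANICS or Smalldata); None marks a missing value.
rows = [
    {"gender": "F", "spend": 10.0, "flag": 1},
    {"gender": "M", "spend": 12.0, "flag": 0},
    {"gender": "F", "spend": 11.0, "flag": 0},
    {"gender": None, "spend": 95.0, "flag": 1},
    {"gender": "F", "spend": None, "flag": 1},
]


def column(name):
    """Return all values of one variable, missing values included."""
    return [row[name] for row in rows]


def missing_count(values):
    """Number of missing observations (compare with step f)."""
    return sum(v is None for v in values)


def percent_frequencies(values):
    """Percentage of non-missing observations in each level (a percentage histogram)."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    return {level: 100.0 * n / len(present) for level, n in counts.items()}


def skewness(values):
    """Moment-based skewness m3 / m2**1.5; positive means a longer right tail."""
    present = [v for v in values if v is not None]
    n = len(present)
    mean = sum(present) / n
    m2 = sum((v - mean) ** 2 for v in present) / n
    m3 = sum((v - mean) ** 3 for v in present) / n
    return m3 / m2 ** 1.5


def looks_binary(values):
    """True when a variable takes exactly two distinct non-missing values."""
    return len({v for v in values if v is not None}) == 2


if __name__ == "__main__":
    print("missing gender:", missing_count(column("gender")))
    print("gender percentages:", percent_frequencies(column("gender")))
    print("spend skewness:", skewness(column("spend")))
    print("flag looks binary:", looks_binary(column("flag")))
```

A numeric variable coded only as 0 and 1 passes `looks_binary`, which is the usual reason an automatically assigned Interval level deserves a second look.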
Deliverables (please follow these instructions):
* Create a single MS Word document. Make sure your name, CWID, and section number appear on the cover page of this document.
* Create a running header or footer so that your name and the page number appear on every page.
* Show screenshot(s) of the relevant information and/or the answer to each question, as appropriate.
* Upload this document to the appropriate drop box by the deadline stated there.
* Use a 5% level of significance, unless a problem states otherwise.