Assignment Task | My Assignment Tutor

You work at Cadbury as a data scientist. The product development team have approached you because they want to develop a new line of chocolates. Cadbury has a long history in the confectionery market, but its target market has typically been the lower end: “chocolate as a grocery item”. They are now looking to develop a range of chocolates for very discerning chocolate connoisseurs. This chocolate will be more expensive and will not be sold in supermarkets but rather through specialty stores or direct sales on a new website. The product development team aren’t sure what characteristics this new chocolate should have taste-wise, but they know that they want it to be distinctive.

An executive in the product development team at Cadbury head office has provided you with a dataset covering all of the current producers and has asked you to provide a report with recommendations about what attributes this new chocolate could have. Note: not all columns are related to this purpose.

First, the product development team would like to get a better understanding of what sorts of attributes the current providers’ beans have. They have asked you to describe the data and find interesting phenomena.

Second, the product development team have asked you to explore the data in more detail. They would like you to use your expertise in data science to dig out anything you feel is interesting or significant. They are looking for attributes of the beans that could be put together to create a distinctive yet tasty chocolate.

You are required to prepare a report about your findings and to make suggestions about which attributes you would recommend be considered in determining the provider of the cocoa. You are also required to provide the script of the code you have used to prepare and explore your data.

The potential audiences of this report include other staff within Cadbury, such as executives or sales staff.
These staff may have limited ICT or mathematical knowledge, so the report should be technical but include clear explanations of the findings.

To prepare the report, please include the following sections:

Introduction

Provide an introduction to the problem. Include background material as appropriate: who cares about this problem, what impact it has, where the data comes from, and what the dimensions and structure of the data are.

Data Setup

Describe how to load the data and how the pre-processing is performed. The original dataset is not ready for analysis, and it differs from the data formats we have seen in previous practices. This means we need to do some pre-processing, either for the whole dataset or for the subset of the dataset required for each sub-task described later. Once you have some ideas for the exploratory or advanced analysis, you will need to adjust the form of the dataset. This can be achieved either by manipulating records in R by transposition or subsetting, or with other tools (e.g. Notepad or Excel) before reading them into R. Please clearly explain the way you have cleaned the data in this section. If you use Excel, please still explain the steps that you used for cleaning.

Exploratory Data Analysis

Two one-variable analyses with graphs

A one-variable analysis studies one variable (one column) at a time. You can choose the attributes you want for this, but the attributes you select need to add to the story you are telling about which cocoa to select.

- Perform 2 one-variable analyses and graph them
- Explain the findings for each graph
- Provide the code for each graph

Two two-variable analyses with graphs

A two-variable analysis studies the relation between two variables. It is up to you to decide which attributes/variables you use for this analysis, but the attributes you select need to add to the story you are telling about which cocoa to select.
- Perform 2 two-variable analyses and graph them
- Explain the findings for each graph
- Provide the code for each graph

Advanced Analysis

Two linear regression analyses with graphs

Briefly explain the concept of linear regression (with references). It is up to you to decide which attribute/s you use for this analysis. You may choose to use one or two attributes for this, but the values you select need to add to the story you are telling about which cocoa you will recommend.

- Perform 2 linear regression analyses and graph them
- Explain the findings for each graph
- Provide the code for each graph

Decision tree

Briefly explain the concept of decision trees (with references). It is up to you to decide which attribute/s you use for this analysis. You may choose the attributes for this, but the values you select need to add to the story you are telling about which cocoa you will recommend.

- Create a decision tree and the resulting visualisation
- Explain the findings for the decision tree
- Provide the code for the decision tree

Conclusion

Sum up your findings and provide some insight into them. Provide your overall recommendation/s in this section, e.g. which cocoa you have selected and why.

Reflections

In this part, discuss any difficulties you had performing the analysis and how you solved those difficulties. Reflect on how the analysis process went for you, what you learnt, and what you might do differently next time. Aim to write 2-4 paragraphs.

For the data analysis (Sections 3 and 4), you need to provide the R code, an explanation of the code, and the result. Please present each R code snippet in your report in a box with some comments. For example:

    # Draw a boxplot on the attribute “Income”
    boxplot(MyData$income)

The marking rubric is viewable on Blackboard.

Report Format

Your report should be no less than 1,200 words and ideally no longer than about 2,000 words. Text in R code snippets is not counted.
The report MUST be formatted using the following guidelines:

- Title page: include your name as the report’s author.
- Header: report title
- Footer: your name and the page number
- Paragraph text: 12 point Calibri or Times New Roman, single line spacing
- Headings: in an appropriate type and size
- Margins: 2.5cm on all margins
- Page numbering: the Introduction and onwards use conventional numerals (1, 2, 3, 4), starting on page 1 from the Introduction.

The report is to be created as a single Microsoft Word document (version 2007 or later). No other format is acceptable, and submitting in another format will result in the deduction of marks.

Please follow the conventions detailed in: Summers, J. & Smith, B., 2014, Communication Skills Handbook, 4th Ed, Wiley, Australia.

Referencing

References for the explanation of decision trees and linear regression are required. These references should follow the Harvard referencing style. Note that ALL references should be from journal articles, conference papers, technical papers or a recognized expert in the field. Use the library databases or Google Scholar to find appropriate articles. DO NOT use Wikipedia as a reference. If you would like help with referencing, check this link out.

End of Assignment
