Tidy Data & Network Substructures

ECI 589 SNA and Education: Unit 3 Independent Study

Author

Shaun Kellogg

Published

March 7, 2023

The final week of each unit is designed to provide you some space for gaining proficiency with R and conducting an independent analysis. For students new to R and have not completed ECI 589: Intro to Learning Analytics, you will be required to complete three Posit Primers focused on data essential wrangling skills in R. For students who have completed ECI 586 or are already comfortable with R basics, you will be required to create a simple “data product” designed to illustrate key insights about about the student and/or teacher reported middle school friendship networks.

Posit Primers (R Beginners)

For students new to R, this week you will learn to how to make and use tidy data, the data format designed for R.

Assignment

Your assignment this week is to complete the following primers located in the Tidy Your Data section on Posit Cloud:

  1. Reshape Data. Data comes in many formats, but R prefers just one: Tidy Data. Learn to recognize and make tidy data in this tutorial, as well as how to reshape the layout of any data set.
  2. Separate and Unite Columns. Here you will learn to separate a column into multiple columns and to reverse the process by uniting multiple columns into a single column. Then you’ll practice your data wrangling skills on messy real world data.
  3. Join Data Sets. Complete your data wrangling education by learning to work with relational data. Here you will learn how to augment data sets with information from related data sets, as well as how to filter one data set against another.

Assessment

To assess your knowledge and skills, you will complete a brief online quiz worth 6 points. This is an “open book” quiz and you are welcome to refer to the course readings, primers and case study to complete. You will be able to attempt this quiz more than once, though penalties may be assigned for incorrect responses.

For this quiz you will need to navigate to the ECI 589 SNA and Education workspace, access the Unit 3 - Independent Study R Studio project assignment, and open the unit-3-quiz.R file to use as you're working through this quiz. This will allow you to type all of your code, add comments, and then save your .R script for reference as needed. In addition, some quiz items may ask you to open and manipulate a data file saved in the data folder. 

Independent Analysis (Prior R Experience)

For those who completed ECI 586 or are already comfortable with R, you will be required to conducted an independent analysis and create a simple “data product” designed to illustrate key insights about the student and/or teacher reported middle school friendship networks.

Assignment

For your independent analysis, your primary goal is to describe a social network both visually and numerically (including node-level measures) by applying the knowledge and skills acquired from the Unit 3 course readings and case study. Grading for this week is fairly lenient and you’ll receive 1 point (6 points total) for completing each of the following tasks:

  1. Identity a data source. I’ve included a data folder located in the Unit 3 Analysis Project in our RStudio Cloud ECI 589 Workspace that contains several Twitter datasets from focused on the Common Core State Standards and Next Generation Science Standards. You are also welcome use the datasets from our SNA and Education course text described here or identify your own network data source related to an area of professional interest. However, if you choose to use an alternative data source, you will need to specify the context in which it was collected and the audience for whom your analysis intended.

  2. Formulate a question. I recommend keeping this simple and limiting to no more than one or two questions. Your question(s) should be appropriate to your data set and ideally be answered by applying concepts and skills from our course readings and case study. For example, you may be interested in determining measures of centrality for a network and identifying key actors.

  3. Analyze the data. I highly recommend creating a new R script in the Unit 3 Independent project space to use as you work through data wrangling and analysis. Your R script will likely contain code that doesn’t make it into your R Markdown presentation or report since you will likely experiment with different approaches and figure out code that works and code that does not.

  4. Create a data product. When you feel you’ve wrangled and analyzed the data to your satisfaction, create an R Markdown file that includes a polished sociogram and/or data table and a narrative highlighting your research question, data source, and key findings and potential implications. Your R Markdown file should include a polished sociogram and/or table, a title and narrative, and all code necessary to read, wrangle, and explore your data.

  5. Share your findings. Render (Quarto) or knit (R Markdown) your data product to a desired output format. This may be a simple HTMLPDF, or MS Word file; or something more complex like HTML5 slides, a Tufte-style handoutdashboard, or website. Then create a new forum post below to share your analysis. Include in the post your published web link (e.g., via RPubs, GitHub Pages, Quarto Pub, or any other methods) or attached file (e.g. Word or PDF) and add a short reflection including one thing you learned and one thing you’d like to explore further.

Assessment

As noted in our Unit 1 Case Study, the final step in our workflow/process is sharing the results of analysis with wider audience. Krumm, Means, and Bienkowski (2018) outlined the following 3-step process for communicating with education stakeholders what you have learned through analysis:

  1. Select. Communicating what one has learned involves selecting among those analyses that are most important and most useful to an intended audience, as well as selecting a form for displaying that information, such as a graph or table in static or interactive form, i.e. a “data product.”

  2. Polish. After creating initial versions of data products, research teams often spend time refining or polishing them, by adding or editing titles, labels, and notations and by working with colors and shapes to highlight key points.

  3. Narrate. Writing a narrative to accompany the data products involves, at a minimum, pairing a data product with its related research question, describing how best to interpret the data product, and explaining the ways in which the data product helps answer the research question.

For the purpose of this assignment, imagine that you have been asked to share your research with teachers and administrators at the middle school in which this data was collected. Remember, your audience will have limited background in SNA and so adapt the process accordingly. This assignment is worth a total of 6 points and will be assessed based on the following criteria.

  1. Polished Sociogram (2 points). The sociogram you selected to share should be visually appealing and help illustrate for your audience similarities and differences in classroom friendships reported by teachers and/or students.

  2. Clear Narrative (2 points). Include a brief narrative to accompany your visualization and/or table that includes the following:

    • The question or questions guiding the analysis;
    • The conclusions you’ve reached based on our findings;
    • How your audience might use this information;
    • How you might revisit or improve upon this analysis in the future.
  3. Peer Feedback (2 points). Finally, take look at the posted data products of your peers and provide some brief (1 paragraph) but constructive feedback to at least two of your peers. Your feedback should include one thing you liked about their data product and why and one suggestion for improvement or extension of the analysis.

Troubleshooting

If you have any questions about the assignment or run into any technical issues, don’t hesitate to email me at sbkellog@ncsu.edu or set up a Zoom meeting at calendly.com/sbkellogg.

I also encourage you to post to the Questions & Troubleshooting thread, especially if others might benefit from the response to your question or issue. You are also more than welcome to respond to the questions posted by your peers.

References

Krumm, Andrew, Barbara Means, and Marie Bienkowski. 2018. Learning Analytics Goes to School. Routledge. https://doi.org/10.4324/9781315650722.