Making database models for the Lokniti question data #18

JusticeV452 · 2024-05-01T14:59:58Z

Now that the codebook CSV mapping question variables to question text (#13) is finished, all the data needs to be added to the database in an easy-to-use format. Based on the structure of the question response data, I propose we integrate the data across all years using three models: one representing an individual who responded ("Responder"), one for each response of a "Responder" ("Responses"), and one for the Codebook. More details on what each model should contain:

"Responder" (or some other variation): @vaeyias
Each row of the main data contains information that uniquely identifies a person, along with their responses to the survey questions.
The Responder model holds only personal information, such as a person's state name, IDs, age, status, etc. The question responses are stored in the "Responses" table rather than on the responder itself.
"Responses":
The Responses model contains every response from every responder, with attributes storing the year, the question variable name, a foreign key to the responder, and a foreign key to a codebook entry.
"Codebook":
The Codebook should have the same attributes as the column names in the spreadsheet created in "Make codebook for column names in lokniti question data" (#13), and it can be loaded straightforwardly using the existing update_db command. Implementing this model first might make it easier to look up the variable name that holds the state name when creating the Responder model (each responder should have a state name attribute, but the variable name containing the state name varies from year to year).
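To make the relationships concrete, here is a rough sketch of the three tables and the join that connects a response back to its responder and its codebook entry. It uses the standard-library sqlite3 module instead of Django models so it runs without a project. All table and column names are hypothetical, the "answer" column is an assumption (the proposal above does not name a field for the response value), and the inserted row is made-up demo data.

```python
import sqlite3

# Hypothetical schema mirroring the proposal: Responder, Responses, Codebook.
# The real project would declare these as Django models instead.
SCHEMA = """
CREATE TABLE codebook (
    id INTEGER PRIMARY KEY,
    year INTEGER NOT NULL,
    variable_name TEXT NOT NULL,
    question_text TEXT NOT NULL
);
CREATE TABLE responder (
    id INTEGER PRIMARY KEY,
    year INTEGER NOT NULL,
    state_name TEXT,
    resno INTEGER,
    age INTEGER
);
CREATE TABLE response (
    id INTEGER PRIMARY KEY,
    year INTEGER NOT NULL,
    variable_name TEXT NOT NULL,
    answer TEXT,  -- assumed column for the response value
    responder_id INTEGER NOT NULL REFERENCES responder(id),
    codebook_id INTEGER NOT NULL REFERENCES codebook(id)
);
"""


def build_demo_db():
    """Create an in-memory database with one made-up row per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO codebook VALUES (1, 2019, 'Q1', 'Did you vote?')")
    conn.execute("INSERT INTO responder VALUES (1, 2019, 'Kerala', 12, 34)")
    conn.execute("INSERT INTO response VALUES (1, 2019, 'Q1', 'Yes', 1, 1)")
    conn.commit()
    return conn


def answers_with_question_text(conn):
    """Follow both foreign keys from each response row."""
    rows = conn.execute(
        """
        SELECT responder.state_name, codebook.question_text, response.answer
        FROM response
        JOIN responder ON responder.id = response.responder_id
        JOIN codebook ON codebook.id = response.codebook_id
        ORDER BY response.id
        """
    )
    return rows.fetchall()
```

Keeping the question text in the codebook table only means a response row stays small, and the text for a given year and variable is stored once.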

We could have as many as three people working on this issue, each implementing one of the models above.

vaeyias · 2024-05-01T15:20:46Z

In the lokniti-data branch, I started each model/serializer, and the Codebook model is complete. Responders is in progress. Also, instead of using the "Resno" columns in the data files as the respondent number, we should use the first column, because the "Resno" columns have a lot of repeats (maybe resno does not mean respondent number?). I renamed the first column to "entry_no" in each file.

JusticeV452 · 2024-05-01T15:34:24Z

I looked at the CSVs again, and I think "resno" is per state or per some other combination of pc/ac/ps id. Regardless, the resno attribute of the model should still reflect what is in the CSV, since it is one of the fields on the document that people needed to fill out. Django automatically assigns a unique id to every object added to a table, so there is no need for a separate id attribute per respondent if it's just the row number.
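One way to test the guess that resno is only unique within some combination of other columns is to count how often each key tuple occurs. This is a minimal sketch with made-up rows; the column names are assumptions and would need to match the actual CSV headers.

```python
from collections import Counter


def repeated_keys(rows, key_columns):
    """Return the key tuples that appear on more than one row, with counts."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Made-up rows for illustration only; real rows would come from the data files.
rows = [
    {"state_name": "A", "ps_id": 1, "resno": 1},
    {"state_name": "A", "ps_id": 1, "resno": 2},
    {"state_name": "B", "ps_id": 1, "resno": 1},
]

# resno alone repeats, but it is unique once combined with the other columns.
resno_only = repeated_keys(rows, ["resno"])
combined = repeated_keys(rows, ["state_name", "ps_id", "resno"])
```

If the combined key comes back empty on the real files, that combination identifies a respondent and resno can be stored exactly as it appears in the CSV.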

vaeyias · 2024-05-01T16:58:32Z

Codebook and Responders models are complete:

To load Codebook, run update_db.
To load Responders, run load_responders on each of the four NES data files.

Some sketchy things I have noticed:

Some respondent numbers, ages, and ps_ids are nan; I set the default value for all of these to 0, but there is probably a better default value.

JusticeV452 · 2024-05-01T17:10:40Z

Great, thanks! I think you should add null=True, blank=True to every field whose CSV column might contain nan. That way, those attributes can be set to None instead of 0 to signify that they are blank, and they won't match general queries on those values. If we want to find people who haven't responded, we could check explicitly using something like [model].objects.filter([attribute]__isnull=True).count()
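As a rough illustration of why None is a safer placeholder than 0, the sketch below converts missing values before they would be handed to a model field declared with null=True, blank=True. The helper names are hypothetical and the ages are made up; this is plain Python, not the project's loader.

```python
import math


def nan_to_none(value):
    """Map missing CSV values (NaN, empty string, 'nan', None) to None."""
    if value is None:
        return None
    if isinstance(value, float) and math.isnan(value):
        return None
    if isinstance(value, str) and value.strip().lower() in ("", "nan"):
        return None
    return value


def mean_ignoring_missing(values):
    """Average only the values that are actually present."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Made-up ages for illustration: the middle respondent has no recorded age.
raw_ages = [30.0, float("nan"), 50.0]

# Defaulting to 0 silently drags the average down.
ages_zero_default = [0 if math.isnan(a) else a for a in raw_ages]
mean_with_zero = sum(ages_zero_default) / len(ages_zero_default)

# Using None keeps the missing value out of the calculation.
ages_none_default = [nan_to_none(a) for a in raw_ages]
mean_with_none = mean_ignoring_missing(ages_none_default)
```

With None stored, a filter such as age=0 would no longer pick up respondents whose age is simply unknown, and the __isnull lookup can count them explicitly.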

JusticeV452 assigned vaeyias May 1, 2024

czheng10 self-assigned this May 2, 2024

vaeyias linked a pull request May 13, 2024 that will close this issue

Models for Lokniti Data #20

Merged

JusticeV452 closed this as completed in #20 May 14, 2024
