DATE DATA: Update 1
- steincarl
- Feb 28, 2021

The master spreadsheet currently has 133 records on it.
It currently has 33 columns for each record.
Since my initial post, my primary task has been to tighten up the headers and clean the data within for existing records. I continue to add/discover as many new events as possible. Occasionally someone (Dan May or Dani, in order of likeliness) will suggest another interesting column to track for some detail of an `Event`. Most recently I added `IsNaturalDisaster`, as proposed by Dani, when she and I started running through a list of possible disaster movies that might have dates attached. 2012? Definitely. The Day After Tomorrow? Probably. There's probably a scene of Dennis Quaid in his research station with a date stamp or something. Unstoppable? Face/Off? Volcano? Dante's Peak? TWISTER?
Adding a new column requires me to manually add that detail to the 130+ records that already exist. When dealing with a (currently) small number of records for a passion project, reviewing rows to add a single field of data is hardly a concern. But I force myself to remember that adding a new column in an enterprise setting could affect, like, one billion rows. With that many records, and when the table deals with money instead of fictional dates, a table with flimsy design is hardly acceptable.
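For anyone following along in code rather than in a spreadsheet, here is a minimal sketch of what "adding a column" does to rows that already exist. It assumes each record is a plain Python dict; the `Title` and `IsViolent` keys and the sample rows are made up for illustration and are not the real sheet's headers.

```python
from typing import Any


def add_column(records: list[dict[str, Any]], name: str, default: Any = None) -> int:
    """Add `name` to every record that lacks it; return how many rows were touched."""
    touched = 0
    for record in records:
        if name not in record:
            # None means "not reviewed yet", which is different from False.
            record[name] = default
            touched += 1
    return touched


# Hypothetical sample rows (the real sheet has 133 records and 33 columns).
records = [
    {"Title": "Sample Movie A", "IsViolent": False},
    {"Title": "Sample Movie B", "IsViolent": True},
]

touched = add_column(records, "IsNaturalDisaster")
print(f"{touched} existing rows now need a manual review")
```

Defaulting to `None` instead of `False` keeps "nobody has checked this yet" separate from "checked, and it is not a natural disaster." Every touched row is still a row someone has to review by hand, which is exactly the cost that balloons at a billion rows.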
I have a few more records to add to the spreadsheet, so I'm not quite bone-dry on additions yet.
I have a slightly longer list of movies that possibly contain dates, or things from my memory that seem like they’d contain a date.
Just recently, I crossed Cloverfield off that list. I remembered the film was mostly found-footage style, and I was right. I was able to grab the time stamp (May 22) from the night of the party when Cloverfield shows up and starts smashin' NYC.
Dani and I spent pretty much the entirety of a rainy, cold February weekend watching films, grabbing dates, and talking data. Deep Impact had some solid date drops that all related to the `E.L.E` (Extinction Level Event) evacuation and impact. I scrubbed through a few films quickly, trying to catch dates where I thought they might appear. That was hit or miss. Armageddon. No date. Just a huge countdown timer. Never a date.
I shared some basic statistics on Instagram to keep contributors engaged while also stoking awareness of this project:
tl;dr:
Percentage of calendar filled in: 36%
Percentage of dates with years: 67%
Records that take place in the future: 4
Percentage of `Events` that are recorded as violent: 20%
Percentage of `Events` that are recorded as disastrous: 28%
Percentage of `Events` that occur in fictional locations: 52%
Percentage of `Events` that are recorded as internet famous: 16%
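Numbers like these are simple ratios over the record set. Below is a rough sketch of how a few of them could be computed. It assumes "calendar filled in" means distinct month/day pairs out of 366 possible days (leap day included), which is my guess at a definition; the `Event` fields and the three sample rows are hypothetical stand-ins, not the real data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    month: int
    day: int
    year: Optional[int] = None  # some dates appear on screen without a year
    is_violent: bool = False


def percent(part: int, whole: int) -> int:
    """Whole-number percentage, returning 0 for an empty denominator."""
    return round(100 * part / whole) if whole else 0


def calendar_coverage(events: list[Event]) -> int:
    """Share of the 366 possible month/day slots with at least one event."""
    distinct_days = {(e.month, e.day) for e in events}
    return percent(len(distinct_days), 366)


def share_with_year(events: list[Event]) -> int:
    return percent(sum(e.year is not None for e in events), len(events))


def share_violent(events: list[Event]) -> int:
    return percent(sum(e.is_violent for e in events), len(events))


# Made-up rows for illustration only.
events = [
    Event(month=5, day=22),
    Event(month=5, day=22, year=1999, is_violent=True),
    Event(month=10, day=31, year=2020),
]

print(calendar_coverage(events), share_with_year(events), share_violent(events))
```

Two events landing on the same day count once toward calendar coverage, which is why the set of `(month, day)` pairs is used instead of the raw row count.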
The Date Data update Story was a thin follow-up to my initial request for people to keep their eyes and ears peeled to contribute. This post conveyed a baseline understanding of some of the more complex queries I intend to run as the project grows.
It is worth remembering that even though not many of my friends/followers responded to it, the people who might enjoy the creation of a "Pop Culture Data Vault" / Date Data aren't in my circle yet. I'm brainstorming what domains/handles to register to start broadcasting this project to the wider data community.
I’m reading The Elephant in the Fridge by John Giles as I begin the data vault design aspect of the project. The book is a welcome refresher on data vault basics and standards, especially now that I have actual data objectives.
I bought a brand new dotted-grid Moleskine notebook tonight to work on designing the data vault aspects of this project. I've always kept a sketchbook. This project is different than any that have come before it, and its notebook should be similarly off-kilter.
I’ve started imagining/sketching different Hubs and what their Satellites would track on random graph paper that needs a home.
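To make the Hub/Satellite idea concrete, here is a small sketch of one possible shape, not the design from my notebook. It follows the general data vault pattern: a Hub holds only the business key plus load metadata, and a Satellite hangs descriptive attributes off the Hub's hash key. The names `HubEvent`, `SatEventFlags`, the `"Sample Movie|party scene"` business key, and the choice of MD5 for hashing are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_key(business_key: str) -> str:
    """Derive a stable hash key from a trimmed, upper-cased business key."""
    normalized = business_key.strip().upper()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class HubEvent:
    event_hk: str        # hash key, shared with every Satellite row
    event_bk: str        # business key as a human would write it
    load_dts: datetime   # when this row was loaded
    record_source: str   # where the row came from


@dataclass(frozen=True)
class SatEventFlags:
    event_hk: str        # points back to HubEvent
    load_dts: datetime
    is_violent: bool
    is_natural_disaster: bool


now = datetime.now(timezone.utc)
business_key = "Sample Movie|party scene"  # hypothetical key format

hub = HubEvent(hash_key(business_key), business_key, now, "master.xlsx")
sat = SatEventFlags(hub.event_hk, now, is_violent=True, is_natural_disaster=False)
```

The appeal of this layout for a project like this one is that a newly suggested column becomes a new Satellite (or a new Satellite row) rather than a change to every existing record.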
Ideally between blog posts, actual tables (still an .xlsx file), and a notebook for design, the Date Data project should be well documented for those following along.