What Do the Terms “Structured” and “Unstructured” Data Actually Mean?

If you’re like most people in the world of analytics, you probably didn’t get a formal education on the topic. Sure, we all know a few data scientists, but most of us came from other backgrounds. When semi-familiar terms are tossed around, it’s easy to nod along with the crowd while wondering in the back of your mind, “Does that mean what I think it means?”

That brings me to the topic for this week. I was having a discussion with a friend recently about the difference between structured and unstructured data. Even though I work with data all the time, I had to look the definitions up to be sure I knew what I was talking about. So let’s get to it.


Structured Data

Structured data is any data with a defined structure, such as rows and named columns, that can be easily scanned or analyzed for information. A spreadsheet or a database table on a server is a good example.


As an example, let’s say you wanted to know how many items we sell that fall under the “Tables” Category. Working from the image above, you could easily come up with an answer using Excel’s built-in functions.
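The same question is just as easy to answer in code. Here is a minimal sketch in Python, assuming the spreadsheet has been saved as CSV text with a "Category" column. The product names and rows below are made up for illustration and are not the data from the image.

```python
import csv
import io

# Hypothetical product list: names and rows are invented for this example.
PRODUCTS_CSV = """\
Product,Category
Oak Conference Table,Tables
Mesh Office Chair,Chairs
Round Coffee Table,Tables
Five-Shelf Bookcase,Bookcases
"""


def count_in_category(csv_text, category):
    """Count the rows whose Category column equals the given category."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(1 for row in reader if row["Category"] == category)


print(count_in_category(PRODUCTS_CSV, "Tables"))  # prints 2
```

Because every row has the same named columns, the whole question comes down to a one-line filter.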


Unstructured Data

Unstructured data is generally much harder to get answers from than structured data. It often has no defined format, so compiling it into something you can analyze is difficult.


For instance, let’s say I had a list of 100 PDF files and I wanted to know how many referred to servers somewhere in the document. Answering that question would require a very manual process.
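To see why, here is a rough sketch in Python of the simplest possible approach: a keyword search. It assumes the text has already been pulled out of each PDF by some other tool, which is a hard step in its own right and is not shown here. The function name and the sample documents are hypothetical.

```python
import re


def count_documents_mentioning(texts, term):
    """Count the texts that contain the term (or its plural) as a whole word."""
    pattern = re.compile(rf"\b{re.escape(term)}s?\b", re.IGNORECASE)
    return sum(1 for text in texts if pattern.search(text))


# Stand-ins for text extracted from PDF files (invented for this example).
documents = [
    "The quarterly report covers the new database servers in Denver.",
    "Lunch menu for the company picnic.",
    "Our waiter was an excellent server.",
]

print(count_documents_mentioning(documents, "server"))  # prints 2
```

Notice that the restaurant review gets counted too. With no structure to tell us what each word refers to, a keyword match can’t tell a computer server from a waiter, so someone still has to read the results.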

More on Structured vs. Unstructured here.


What Does Tableau Prefer?

Tableau prefers data in a structured format. Having defined rows and columns makes filtering and analysis by different categories of data much simpler.


