But What is Metadata?

TyperTech

What Metadata actually is.

Few people discuss what metadata actually is, probably because it's confusing, or because it comes in several forms.

From philosophy we know that 'meta' means 'higher'. So metadata comes to us as a signifier: it's data that points at a higher form of data, but it is not that data itself.

There is Descriptive Metadata, such as a book's ISBN: it refers to the book, but it is not the book itself, nor does it tell us which edition is the latest. But it gets us closer to the book, even if we are ten editions behind.

In Structural Metadata, we have page numbers, sections, chapters, indexes and tables of contents. None of these means anything on its own, but they all point up to the object they relate to; they provide shorthand information about the data, data about data. But in OpSec, the end goal isn't the data. It's to be able to identify the data without having it present. Metadata eliminates the blockages that prevent the adversary from finding the target.

The more of it we gather, the closer we come to the higher thing, the thing that's being signified.

In a computer file, the metadata is the file's properties: its name, its type, where it's stored, when it was created and last modified, how much space it takes up on the hard drive, who owns it, and more.
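To make the file example concrete, here is a minimal sketch in Python, using only the standard library. It reads a file's properties without ever opening its contents. The function name `file_metadata` and the demo file are assumptions for illustration. Creation time is left out because `os.stat` does not report it consistently across operating systems.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Return filesystem metadata for `path` without reading its contents."""
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "extension": os.path.splitext(path)[1],
        "directory": os.path.dirname(os.path.abspath(path)),
        "size_bytes": info.st_size,
        "modified_utc": modified.isoformat(),
        "owner_uid": info.st_uid,  # numeric owner id; always 0 on Windows
        "permissions": stat.filemode(info.st_mode),
    }


if __name__ == "__main__":
    # Hypothetical demo file so the sketch runs anywhere.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write("the data itself")
        demo_path = handle.name
    try:
        for key, value in file_metadata(demo_path).items():
            print(f"{key}: {value}")
    finally:
        os.remove(demo_path)
```

Nothing in the returned dictionary is the data itself, yet it already says where the file lives, how big it is, who owns it and when it was last touched.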

In the context of social media, "tags" and symbols like # and @ are metadata, as they point to, and call out to, something higher than themselves.
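As a rough illustration, the tags can be pulled out of a post and read on their own, without the rest of the text. This is only a sketch: the pattern and the name `extract_tags` are assumptions, and real platforms have their own rules for what counts as a valid tag.

```python
import re

# A '#' or '@' that does not follow a word character, then the tag name.
# The lookbehind keeps email addresses like bob@example.com out of the results.
TAG_PATTERN = re.compile(r"(?<!\w)([#@])(\w+)")


def extract_tags(post):
    """Split the tags in a post into hashtags and mentions."""
    hashtags, mentions = [], []
    for symbol, name in TAG_PATTERN.findall(post):
        if symbol == "#":
            hashtags.append(name)
        else:
            mentions.append(name)
    return {"hashtags": hashtags, "mentions": mentions}


if __name__ == "__main__":
    print(extract_tags("Reading about #metadata with @alice today"))
```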

In Internet tracking: the type of device you use, your location, the time of day, your daily routine and interactions, your preferences, your associations, your habits and so much more can be combined into a picture of you that is used to market products to you. That picture is the higher object: the collective of all these things.

Metadata is NOT the data itself. Metadata is not the document or the photograph.

In photography, EXIF data is the metadata. This one is easy. The geolocation. The date and time the photo was captured. The file name, the exposure, all the camera settings, etc.

In a Word document, the relational objects are the metadata: the file name, the size, how long it took to produce the file, the creation date, the last time it was modified, the original author, and whom it was shared with. Not the document itself.

Same thing in a spreadsheet, or in any relational database: the tab names, the table names, the column names. Not the spreadsheet itself.
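The relational database case can be sketched with Python's built-in `sqlite3` module. The helper below lists table and column names without reading a single row. The tables `contacts` and `payments` are hypothetical, made up for the demo.

```python
import sqlite3


def describe_schema(connection):
    """Map each table name to its column names, without reading any rows."""
    schema = {}
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        columns = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
        # Each row is (cid, name, type, notnull, default, pk); keep the name.
        schema[table] = [column[1] for column in columns]
    return schema


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    # Hypothetical tables for the demo.
    db.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
    db.execute("CREATE TABLE payments (payee TEXT, amount REAL, paid_on TEXT)")
    print(describe_schema(db))
    db.close()
```

Even with every row missing, the names alone tell an observer what kind of information the database was built to hold.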

In an email, the header contains the metadata, but the email itself is not the metadata. The obvious fields: subject line, to whom, from whom, when, where from, and whether the IPs are valid.
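A small sketch of that split, using Python's standard `email` package. The message, the addresses and the IP are made up for illustration (example.com and 192.0.2.10 are reserved for documentation), and `split_email` is a hypothetical helper name.

```python
from email import message_from_string

# Hypothetical raw message; every address and IP here is a placeholder.
RAW_MESSAGE = """\
From: alice@example.com
To: bob@example.com
Subject: Lunch on Friday?
Date: Mon, 04 Mar 2024 09:15:00 +0000
Received: from mail.example.com ([192.0.2.10]) by mx.example.net

See you at noon.
"""


def split_email(raw):
    """Return (headers, body): the metadata and the data, kept apart."""
    message = message_from_string(raw)
    # A list of pairs, because headers such as Received can repeat.
    headers = list(message.items())
    body = message.get_payload()
    return headers, body


if __name__ == "__main__":
    headers, body = split_email(RAW_MESSAGE)
    for name, value in headers:
        print(f"{name}: {value}")
    print("--- body ---")
    print(body)
```

Everything printed above the separator is metadata. The one line below it is the email itself.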

There is a great PDF from M.I.T., "Computer approaches for handling large social science data files", that describes metadata as "a record … of the data records".

As we can see, metadata is a sort of taxonomy for classifying data according to its value. It leaves an audit trail, enabling people to identify the attributes of a person or thing. In short, it is a set of data that describes and gives information about the real data.

As I mentioned earlier, in philosophy 'meta' is taken to mean 'higher'. It is a word which, like so many other things, we have the ancient Greeks to thank for. When they used it, meta meant “beyond,” “after,” or “behind.” The “beyond” sense of meta still lingers in words like metaphysics or meta-economy.

So, this means that data is nothing but the sum total of its metadata. Metadata is what helps us create a complete picture of our data and understand it in its entirety.

Metadata for scientific research includes information about test design, test population details, the definition of terms, measurement methods, and data collection schedules.

We can't get rid of metadata entirely; we would turn to nothingness. Our RNA and DNA would no longer exist. We can't get rid of the properties of a file without damaging the file. I can alter a file only to an extent: if I changed an '.rar' to a '.txt', the contents would be the same but it would no longer open as it should.

The four main types of metadata (technical, operational, business and social) are just more ways of taxonomy, of classifying data. They are the basic things that make up human activity. There is nothing frightening about metadata. It means you live in the world of atoms.

It is data whose only purpose is to define and describe the data object it is linked to. The salinity of the ocean, but not the ocean itself. You can't take away the salt without taking away what we all agree upon as 'the ocean'.

Tor does a fantastic job of protecting your metadata, Tails protects your metadata, and exifcleaner and exiftool get rid of unnecessary metadata.

Facebook, Microsoft, Google and Apple thrive on collecting your metadata. They're not trying to spy on you personally; they just need to know all your consumer habits, so they can harvest them and broker them for their own benefit. It gets tough to do that manually, so now artificial intelligence does the groundwork.

We use privacy guides and privacy tools to minimize what we dish out. But having no metadata means ... you're off the grid.

Not many people realize that there are two (2) Tors, the browser and the network infrastructure, and they both minimize and neutralize our metadata.

The point of this post is that metadata isn't just data about data but data that surrounds the target, so the adversary has a better understanding of the target when they go looking for it. The content or body of the email doesn't matter anymore; it's who we communicate with most, what our GPS patterns look like, what our sleeping pattern looks like, where we go to spend our money, where they can find us ...
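To show how little the content matters, here is a minimal sketch that profiles someone from message metadata alone. The log is entirely made up: each entry is only a timestamp and a recipient, with no message bodies. The name `profile` is an assumption, not any real tracking tool.

```python
from collections import Counter
from datetime import datetime

# Hypothetical message log: (timestamp, recipient) only, no content.
MESSAGE_LOG = [
    ("2024-03-04T08:05:00", "alice"),
    ("2024-03-04T23:40:00", "bob"),
    ("2024-03-05T08:10:00", "alice"),
    ("2024-03-05T12:30:00", "carol"),
    ("2024-03-06T08:02:00", "alice"),
    ("2024-03-06T23:55:00", "bob"),
]


def profile(log):
    """Summarize who is contacted most and at which hours, from metadata only."""
    contacts = Counter(recipient for _, recipient in log)
    active_hours = sorted({datetime.fromisoformat(stamp).hour for stamp, _ in log})
    return {
        "top_contact": contacts.most_common(1)[0][0],
        "active_hours": active_hours,
    }


if __name__ == "__main__":
    print(profile(MESSAGE_LOG))
```

With this sample log the top contact is alice and the active hours are 8, 12 and 23. The hours with no activity at all hint at a sleeping pattern, and not one message was read to get there.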
 