A categorized multimodal TikTok dataset

dc.contributor.author: Wedel, Lion
dc.date.accessioned: 2023-11-24T10:30:15Z
dc.date.available: 2023-11-24T10:30:15Z
dc.date.collected: 2023-07-31/2023-08-04
dc.date.issued: 2023
dc.description.abstract: This dataset comprises 11,242 entries covering 5,137 unique videos listed on the TikTok explore page (https://www.tiktok.com/explore) between 31 July and 4 August 2023. The page was accessed from a German IP address without logging in. The data were collected with the 4CAT Toolkit and the Zeeschuimer browser extension. The dataset contains the category and multimodal embeddings for each video.

**Intended Purpose** The dataset is primarily intended for proof-of-concept studies, as a toy dataset for teaching, or for student seminar papers. Because TikTok does not clearly define its categories, such work might explore those definitions or concentrate on methods. The multimodal embeddings allow unsupervised and supervised machine learning techniques to be applied directly.

**Contents** The dataset consists of four zipped .csv files:
* meta_data.zip
* text_embeddings.zip
* audio_embeddings.zip
* video_embeddings.zip

**For further details, please consult the Data Report** (datenbericht_v2.pdf).
dc.identifier.citation: Wedel, L. (2023). A categorized multimodal TikTok dataset [Data set]. Weizenbaum Institute. https://doi.org/10.34669/WI.RD/3
dc.identifier.uri: https://www.weizenbaum-library.de/handle/id/420
dc.language.iso: eng
dc.rights: open access
dc.rights.uri: https://creativecommons.org/licenses/by-nc/4.0/
dc.subject: TikTok
dc.subject: Machine Learning
dc.subject: Video
dc.subject: Audio
dc.title: A categorized multimodal TikTok dataset
dc.type: ResearchData
dcmi.type: Dataset
dcterms.bibliographicCitation.originalpublisherplace: Berlin
dcterms.contributor.datacollector: Wedel, Lion
local.researchgroup: Dynamiken digitaler Nachrichtenvermittlung
local.researchtopic: Digitale Märkte und Öffentlichkeiten auf Plattformen
Files
Original bundle
Showing 1 - 5 of 5

Name: datenbericht_v2.pdf
Size: 94.58 KB
Format: Adobe Portable Document Format

Name: audio_embeddings.zip
Size: 28.42 MB
Format: Unknown data format

Name: meta_data.zip
Size: 2.42 MB
Format: Unknown data format

Name: text_embeddings.zip
Size: 10.09 MB
Format: Unknown data format

Name: video_embeddings.zip
Size: 96.33 MB
Format: Unknown data format
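
Since the four data files are zipped .csv files, a first look at any of them needs only the Python standard library. The following is a minimal sketch, not part of the dataset's documentation: the helper name `read_zipped_csv` is hypothetical, and it assumes that each archive contains a single UTF-8 encoded .csv file with a header row. The column names are not stated in this record, so the sketch only lists them; their meaning is described in the Data Report (datenbericht_v2.pdf).

```python
import csv
import io
import zipfile


def read_zipped_csv(zip_path):
    """Return the rows of the first .csv member of a zip archive as dicts.

    Hypothetical helper: assumes a UTF-8 encoded .csv with a header row.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Collect every member whose name ends in .csv (case-insensitive).
        csv_members = [
            name for name in archive.namelist() if name.lower().endswith(".csv")
        ]
        if not csv_members:
            raise ValueError(f"no .csv file found in {zip_path}")
        with archive.open(csv_members[0]) as raw:
            # Wrap the binary stream so the csv module can read it as text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))
```

For example, `rows = read_zipped_csv("meta_data.zip")` followed by `print(list(rows[0].keys()))` would show which columns the metadata file provides, assuming the archive has been downloaded to the working directory. All values are returned as strings, so embedding columns would still need converting to numbers before any machine learning step.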