How transparent is TikTok?
A detailed attempt to grade TikTok's overall transparency
The heat around TikTok has reached a near-boiling point in the U.S. and seems to only be escalating every day. There are now reports that the White House is demanding that TikTok either sell off its entire US operations or potentially face a complete ban (an authority the White House would like to have and which Congress might give them). And it all seems to be leading up to the CEO of TikTok testifying in front of Congress next week.
For TikTok, one of their many attempts to allay the concerns of lawmakers has been committing themselves to transparency.
Thanks for reading Some Good Trouble! Subscribe for free to receive new posts and support my work.
Not only are they talking about it constantly as a part of their external comms, but they’ve also launched new APIs, committed to external audits, talked publicly about their algorithm for the first time, and even built an actual, brick-and-mortar Transparency Center in Los Angeles.
According to Vanessa Pappas, TikTok's chief operating officer, all of this is part of an attempt for TikTok to offer “unprecedented levels of transparency”.
But is she right?
It can sometimes be hard to differentiate between performative transparency and meaningful transparency. So, I thought I’d attempt a closer look at TikTok’s different transparency efforts to date, how they stack up against other companies, and to what degree they meet or don’t meet the expectations of the audience or other relevant stakeholders of each effort.
My hope is that this is actually the first in a series where I can go through all the major platforms and their various transparency efforts; but if I don’t, the good news is that I think in the next year or so more organizations are going to step up and produce this sort of analysis on a much more regular and comprehensive basis (a kind of Ranking Digital Rights effort but focused specifically on transparency). So, this is partly just a stop-gap for the moment :).
Before we get into it, though, here’s a quick overview of the overall methodology I wanted to try and use:
Any sort of rating like this always involves some degree of subjectivity, so if you have any pushback or think any of this is analyzed incorrectly, please let me know in the comments or email me directly and I’ll make sure to highlight any differences in a future post. That also includes if I’ve just gotten anything wrong…please let me know. This is a big topic with a lot of complexity.
Transparency can mean a lot of different things, but if you look across international efforts to legally mandate some sort of social media transparency over the last year or so, the different requirements tend to fall into six major buckets:
Researcher access
Accessible public data (including scraping)
Algorithmic transparency
Content moderation transparency reports*
Advertising transparency
User-level transparency
What’s Not Covered For Now
That being said, there’s also no shortage of other important transparency areas, including ones that focus specifically on:
AI transparency (increasingly important for lots of obvious reasons),
government requests (including removals and law enforcement requests for data),
covert information operations, and
public risk mitigations/risk assessments.
There are also increasing calls for new transparency standards around app stores (where applicable), revenue sharing with creators, human rights reports, and actual content moderation operations (not just policies) that haven’t gotten as much attention as they probably deserve.
For now, though, I’m going to focus on the six areas listed above. They’re the ones I personally know best and that I think have the most legislative and regulatory attention at the moment. I’ve also added an additional category of “extra credit” to catch any other transparency initiatives a company might be doing that don’t fall neatly into the other six buckets.
For each category, I’m going to assign a score of 0-10, which means the best possible total transparency score is 70. A 0 means doing nothing; a 10 means doing everything possible in the category and ideally even establishing new norms for the rest of the industry. This will be basically subjective, but I’ll try to lay out my thinking for each.
The Role of Audits
Independent auditing should be a baseline expectation for any of these reports and datasets, and I hope it becomes much more of an industry standard and legally required mechanism across the board in the coming years. But for now, if an effort doesn’t include any external auditing, it can’t score higher than 8 in that particular category.
* Note: I’m using “content moderation” here but some platforms have catch-all transparency reports that cover a bunch of different categories. For instance, Twitter includes a security section about how safe user data is, etc. Other platforms break them out in different reports and of course, some of these categories aren’t covered at all.
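To make the rubric concrete, here’s a toy sketch of how the scoring works. The per-category numbers below are hypothetical placeholders of my own (the post only commits to a single overall total), but the two rules are the ones above: each category is graded 0-10, and a category with no external auditing is capped at 8.

```python
# Toy sketch of the scoring rubric described above.
# The per-category scores are hypothetical placeholders, not actual grades.
MAX_PER_CATEGORY = 10
AUDIT_CAP = 8  # a category with no external auditing can't score above 8


def grade(scores, audited):
    """scores: {category: 0-10}; audited: set of categories with external audits.

    Returns (total, maximum_possible)."""
    total = 0
    for category, score in scores.items():
        capped = score if category in audited else min(score, AUDIT_CAP)
        total += min(capped, MAX_PER_CATEGORY)
    return total, len(scores) * MAX_PER_CATEGORY


# Hypothetical per-category grades (seven categories, including extra credit):
scores = {
    "researcher_access": 3, "public_data": 2, "algorithmic": 2,
    "transparency_reports": 5, "advertising": 1, "user_level": 5,
    "extra_credit": 4,
}
total, maximum = grade(scores, audited=set())
print(f"{total}/{maximum} = {total / maximum:.0%}")  # → 22/70 = 31%
```

Note how the audit cap works as a ceiling rather than a deduction: a category can earn a 9 or 10 only once external auditing is in place, which is the incentive structure described above.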
Ok, let’s dig into each of the major 6 transparency areas:
1. Researcher Access
In July of last year, TikTok announced its first-ever plans to provide researchers with the ability to study parts of their platform. That was a very important step forward, but it also started from a completely non-existent baseline for researcher access. Since then, however, their plans have left most academics disappointed. Dr. Joe Bak-Coleman of the Craig Newmark Center for Journalism Ethics and Security at Columbia University lays out some of the challenges, and Megan Brown of the New York University Center for Social Media and Politics, one of the leading social media research labs, points out even more problems.
I generally agree with those statements, although deletion frameworks are a fundamentally complicated problem to solve. I’d also add a few other concerns: the researcher access program doesn’t seem to be available to the broader parts of civil society that might benefit from the data (and could use it safely). For instance, do you need an academic affiliation to get access? If so, that leaves out a huge swath of civil society and the research community.
I’d also point out that the API only provides access to public data. That is a far cry from the sort of academic programs that a lot of researchers hope for, and for which there is existing precedent in the space. For instance, both Twitter and Meta have run academic partnerships in the past that specifically provide access to more sensitive datasets but in ways that protect private user data. And in some cases, Meta and Twitter actively partnered with academics to co-design and manage those research projects.
2. Availability of Public Data
TikTok doesn’t provide tools (similar to CrowdTangle) that offer user-friendly interfaces or real-time APIs to look at and study public data. While their new researcher API is entirely based on public data, it has a lot of limitations, including no front-end interface. As for whether anyone can automatically collect public data for research purposes, I’m not an expert on terms of service and what they mean for scraping, but my non-lawyerly interpretation is that TikTok preserves the right to go after anyone accessing or automatically collecting any data from their service, even if it’s in the public interest. That being said, I wasn’t able to find any known instances of TikTok going after anyone for scraping in the public interest, or adversarially trying to make such scraping hard. Would be curious if others wanted to jump in.
3. Algorithmic Transparency
As far as I can tell, TikTok hasn’t released anything public about its algorithm beyond what’s available to visitors to their Transparency Center. If you are invited to visit, they apparently offer a “code simulator” meant to inform visitors about how the algorithm works, but that is a far cry from offering any actual details or technical insights that could be evaluated, audited, or studied by experts. It also falls well short of what has been made public through some leaked documents. To their credit, there is some user-facing transparency around the algorithm: users can see their watch history, as well as delete it so the algorithm relearns their preferences. But again, that is a far cry from any detailed explanation of how the algorithm works, regular public updates when there are meaningful changes, giving auditors the ability to inspect it, etc.
4. Transparency Reports
TikTok releases quarterly transparency reports around their content moderation (mostly their removal activities) and has for several years. They have a user-friendly front-end interface for their reports, as well as downloadable data. They also have data available about a fairly diverse set of policy violations. When it comes to transparency around covert information operations, they provide both countries targeted and attribution, along with some qualitative description of their findings.
However, where the reports fall well short is in not disclosing any numbers about the reach and impact of harmful content, an important metric missing from almost all the reports in the industry at the moment. For instance, their covert information operations section doesn’t include how many people saw the content or the total number of impressions. At the very least, they should include prevalence, a metric that has its own significant limitations but that Meta uses across its properties as an important component of all its reports. In addition to lacking any distribution metrics, the report is limited to a single high-level set of global numbers (no breakdown by country), doesn’t break out any vulnerable communities, has very little qualitative information, doesn’t include any information about the resources dedicated to the work, and more.
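For readers unfamiliar with the metric, prevalence is roughly the share of sampled content views that landed on violating content (as opposed to counting pieces of removed content). A minimal sketch, glossing over the sampling and labeling machinery that makes the real metric hard:

```python
# Minimal sketch of a "prevalence" estimate: the fraction of sampled views
# that were views of violating content. Real-world versions (e.g. Meta's)
# involve careful sampling and human labeling; this just shows the ratio.
def prevalence(sampled_views: int, violating_views: int) -> float:
    """sampled_views: total views in the sample;
    violating_views: how many of those views were of violating content."""
    if sampled_views == 0:
        return 0.0
    return violating_views / sampled_views


# e.g. 150 violating views out of a 1,000,000-view sample
print(f"{prevalence(1_000_000, 150):.4%}")  # → 0.0150%
```

The point of the metric is that it captures exposure: a single violating video seen ten million times matters more than a thousand violating videos nobody saw, which removal counts alone can’t show.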
New America’s transparency report tracker (linked in the resources below) is also a great resource for everything you might want to know about comparing transparency reports across platforms and how they’ve evolved over time, including TikTok’s, but also where they all fall short.
5. Advertising Transparency
TikTok technically has an “Ads Library,” but it is entirely designed for marketers and meant to encourage best practices among ad buyers, not to promote any sort of transparency around political or social ads. Moreover, as a user, beyond seeing that a video might be sponsored, I can’t see any additional information about the ad, who was targeted, why I saw it, or anything else within the app itself. Here, TikTok is literally years behind the voluntary industry standards for ad libraries set up by Google, Snapchat, and Meta (albeit poorly in a lot of cases), and it will likely soon run afoul of regulations coming into effect in Europe. The only defense of TikTok’s transparency in this case, and a very reasonable one, is that they don’t accept any political advertising, and most of the original ad libraries were specifically designed to address political ad transparency. However, the trend in advertising regulation is to make all advertising more transparent, so unfortunately they can’t be let off the hook.
6. User-Level Transparency
If you have a video removed or an account deleted, there is a public mechanism to appeal, and reports seem to indicate that TikTok will likely respond (more so than other platforms, though there are no rigorous studies verifying this). However, if there is any sort of penalty or demotion on your account or on a specific comment you’ve made, you are not automatically notified, nor is there necessarily any public record of that instance of moderation (for the user or for the people who might otherwise have consumed the content). The community standards are all available online, although not in every language. And again, to their credit, see the above comments about users being able to see and edit their watch history.
To learn more about some of the potential ways to lean into user transparency, you can read the Santa Clara Principles.
7. Extra Credit
TikTok does have physical Transparency Centers around the world, including one they recently opened in Los Angeles, where visitors can experience what it’s like to do actual content moderation; smart people who visited seemed to feel like they learned some new things, which is something. But, among lots of other limitations, it’s not clear who is even allowed to visit. They also engaged Oracle for some sort of algorithmic auditing, but I struggled to learn more about the impact or results of those audits. I assume the results are only available to the appropriate regulators and federal agencies here in the U.S., but again, there’s just not much information about them.
In total, I give TikTok a score of 22 out of 70, or roughly 30%.
At this point, it’s hard to say that TikTok is genuinely transparent in almost any category. In most categories, they’re either behind industry standards or have made little to no progress.
The good news is that it’s clear that TikTok is investing in the space and the pressure is on (or maybe it’s too late). They’ve also made some unique contributions and, more importantly, they have absolutely gotten more transparent in the last year (both in reality and in their commitments to new projects). It’s also true that when it comes to political and electoral content, TikTok doesn’t wield nearly the same direct influence that Facebook, Instagram, and Twitter do, and perhaps some of the responsibilities that Facebook might have in political and civic areas aren’t as urgent for TikTok. However, it’s also entirely possible that politics is coming to TikTok, whether they want it or not. For instance, Democrats are increasingly finding it a powerful place to reach young people.
Given TikTok’s size, its popularity among young people, its ability to influence culture (see Colleen Hoover), and the regulatory scrutiny it is under, its transparency efforts are all the more disappointing relative to its responsibilities.
If TikTok genuinely wants to commit to being more transparent, below is a set of recommendations that could earn them a much higher score, bring them much more in line with industry standards, and likely even put them on solid footing with existing and coming transparency regulations from around the world. And if they truly want to offer “unprecedented levels of transparency”, they should begin thinking about how to tackle most of these:
Build an Ad Library
Build a CrowdTangle equivalent for high-reach public content that is broadly accessible (this should include both a user-friendly interface and APIs)
Add metrics on the scale, cause, and nature of exposure to harmful content to their reports
Commit to external auditing of its reports & datasets
Commit to external auditing of any significant algorithms
Create a research data program that allows access to sensitive datasets in privacy-protecting ways, as well as the opportunity for co-designed research
Create an archive for content and accounts removed for violating information operations rules and make it available to approved researchers
Build more user-facing transparency about why people are seeing ads
Allow users to see if their accounts have been penalized for any violations
Allow users to see if any other accounts have penalties for violations
Leave “tombstones” to make it clear anytime a comment has been removed by the platform or by an account
Make it more clear to users what content is public and what is private
As I mentioned, this is a big, complicated topic. If you notice that I got anything wrong, if you disagree with any of my subjective analyses, or if you just want to add any additional thoughts, please leave comments below or reach out to me directly at firstname.lastname@example.org. If it turns out that I got anything significantly wrong, I’ll publish a follow-up revisiting whatever I missed. Thanks!
There are a lot of great resources to dig into this space more and I’m going to try and write a dedicated post on this at some point but for now:
New America’s comprehensive overview of how transparency reports have changed over time for each platform and how they compare to each other
The Action Coalition on Meaningful Transparency, based in Europe, is doing a ton of terrific work on this entire space
Santa Clara Principles on transparency to users around content moderation
Ranking Digital Rights does a lot of terrific work in this space
The Integrity Institute has a great deck that goes through some approaches to transparency that are widely agreed on as best practices among their members (although I know they’re also in the process of updating this)
And a lot more