New York Times Takes Multiple Steps to Authenticate Videos

The New York Times, which now publishes information explaining its journalistic practices, recently described how it reviews newsworthy videos from a wide variety of sources: news agencies; social media sites such as Twitter, Facebook, YouTube and Snapchat; and eyewitness footage obtained via WhatsApp, from witness contacts on the ground, or by “joining relevant groups.” The verification process breaks down into two steps. First, the Times determines whether a video is “really new.” Second, it will “dissect every frame to draw conclusions about location, date and time, the actors involved and what exactly happened.”

The first step addresses “a major challenge for journalists today,” which is to avoid using “recycled content,” a problem that is “exacerbated in breaking news situations,” notes NYT. In those situations, misattributed video and video depicting old events are common.

“The more dramatic the footage, the more careful we have to be … [to] establish the provenance of each video — who filmed it and why — and ask for permission to use it.” The “perfect scenario” would mean “obtaining the original video file or finding the first version of the video shared online, vetting the uploader’s digital footprint and contacting the person — if it’s safe to do so.”

The next step is to “extract as much detail as possible,” although situations such as “armed conflict and severe state repression often make it challenging to connect with sources, for logistical or security reasons.” But NYT has “developed skills and methodologies to independently confirm or corroborate what’s visible in a video.”

Video from wire services such as The Associated Press, or footage received directly from a source’s cellphone, is much easier to verify, since it arrives with “intact metadata.” Metadata from social media sites and messaging apps, however, is often altered or removed, so NYT has to “look for visual clues regarding location and date in the video itself,” such as landmarks or geographic features that can be “matched up with reference materials such as satellite images, street views and geotagged photographs.”
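One small, concrete piece of this kind of matching is reading location data out of a geotagged reference photograph. EXIF metadata stores GPS coordinates as degrees/minutes/seconds plus a hemisphere reference, which must be converted to decimal degrees before they can be compared against satellite imagery. The sketch below illustrates that conversion; the function name and the sample coordinates are illustrative, not the Times’ actual tooling.

```python
# Sketch: convert EXIF-style GPS coordinates (degrees, minutes, seconds)
# into the decimal degrees used by satellite-imagery and mapping tools.
# Illustrative helper, not NYT's verification software.

def dms_to_decimal(dms, ref):
    """dms: (degrees, minutes, seconds); ref: 'N', 'S', 'E' or 'W'."""
    degrees, minutes, seconds = dms
    decimal = degrees + minutes / 60.0 + seconds / 3600.0
    # Southern and western hemispheres are negative in decimal notation.
    return -decimal if ref in ("S", "W") else decimal

# Example: a point in lower Manhattan, roughly 40°42'46"N 74°00'22"W.
lat = dms_to_decimal((40, 42, 46.0), "N")
lon = dms_to_decimal((74, 0, 22.0), "W")
```

A verifier would compare the resulting coordinate pair (about 40.7128, -74.0061 here) against the landmarks visible in the footage.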

That can be very challenging, and “in the most challenging situations, we might also call on the public to help.” NYT also pays attention to the audio, since “local dialects might help corroborate the general location.” Pinning down the exact date and time can be even more difficult; there, NYT may rely on “historical weather data to detect inconsistencies in a video.”

To determine the specific time of day, it can analyze shadows using a tool called SunCalc. “Finally, we take a close look at what else is visible in a video to draw conclusions about the event and the actors involved,” extracting such details as “official insignia or military equipment” or “doing a frame-by-frame analysis of multiple videos.”
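The idea behind shadow analysis is that the sun’s altitude at a known place, date and time fixes the ratio of a shadow’s length to the height of the object casting it. The sketch below uses a standard rough approximation of solar declination and hour angle; it illustrates the kind of math behind tools like SunCalc, not SunCalc’s actual implementation.

```python
import math

def solar_altitude_deg(day_of_year, solar_hour, latitude_deg):
    """Approximate solar altitude (degrees) for a day of year,
    local solar time in hours, and latitude. Rough sketch only."""
    # Approximate solar declination for the given day of the year.
    decl = -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))
    # Hour angle: zero at solar noon, 15 degrees per hour away from it.
    hour_angle = 15.0 * (solar_hour - 12.0)
    phi, d, h = map(math.radians, (latitude_deg, decl, hour_angle))
    alt = math.asin(math.sin(phi) * math.sin(d) +
                    math.cos(phi) * math.cos(d) * math.cos(h))
    return math.degrees(alt)

def shadow_length_ratio(altitude_deg):
    """Shadow length divided by object height for a given sun altitude."""
    return 1.0 / math.tan(math.radians(altitude_deg))

# Near the March equinox (day ~80) at the equator, the noon sun is almost
# directly overhead, so shadows are very short.
noon_alt = solar_altitude_deg(80, 12.0, 0.0)
```

Comparing the measured shadow-to-height ratio in a frame against the ratio this math predicts for the claimed time helps confirm, or contradict, when the video was shot.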

NYT “also recently started researching how the emerging issue of deep fakes, or media generated with the help of artificial intelligence, will impact our newsroom,” knowing that, because such images contain computer-generated imagery, “the verification process described above will not suffice.” “Instead, we have to build up the technical capacity to win the coming artificial intelligence-powered ‘arms race between video forgers and digital sleuths’.”