Wednesday, May 9, 2018

News Credibility, Verification and the Madness of Crowds - A Junk News Roundup

As the Associated Press states in our News Values and Principles:

"We have a long-standing role setting the industry standard for ethics in journalism. It is our job — more than ever before — to report the news accurately and honestly."

It is easy to see how AP is taking concrete steps in this area through, for example, our fact-checking work (online and on Twitter). And the AP Verify project is building a "newsroom tool that will combine artificial intelligence with our editorial expertise to automatically source and verify user-generated content."

I thought it would be interesting to take a look at some efforts going on elsewhere in the areas of credibility, verification and identifying junk news.

Standards Efforts

The IEEE is working on P7011 "Standard for the Process of Identifying and Rating the Trustworthiness of News Sources". The IEEE is a formal standards body, responsible for many of the technical standards which underpin the internet.

The Credibility Coalition describes itself as "an interdisciplinary community committed to improving our information ecosystems and media literacy through transparent and collaborative exploration." It is not, in itself, a standards body. If you examine the CredCo "about" page, you will spot my photo - I attended early meetings.

The Credible Web W3C Community Group describes its mission as "to help shift the Web toward more trustworthy content without increasing censorship or social division." There is a significant overlap between members of the Credibility Coalition and the Credible Web Community Group. Despite the W3C link, this is not a formal standards effort - Community Groups are open to anyone. There are weekly video conferences to define an informal standard.

The Trust Project describes itself as "a consortium of top news companies" and says it "is developing transparency standards that help you easily assess the quality and credibility of journalism." Again, the Trust Project is not a formal standards body (like IEEE, IPTC or W3C).

Verification Projects

At the recent IPTC meeting, we saw presentations about two European projects aimed at helping to identify the spread of misinformation.

Truly Media is a joint project between ATC and Deutsche Welle. It is "a web-based collaboration platform developed to support primarily journalists and human rights workers in the verification of digital content," and was developed with funds from the EU and the DNI.

InVid aims to develop "a knowledge verification platform to detect emerging stories and assess the reliability of newsworthy video files and content spread via social media." It is an EU-funded project. Their demo was quite sophisticated. They also have a browser plugin which lets you verify news video and images yourself.

Wisdom and Madness

Finally, via Fair Warning, I saw "The Wisdom and Madness of Crowds" - a fun explainer in the form of a game. It walks you through why some crowds turn to madness and some to wisdom, with a focus on the spread of misinformation but also good information. It helps give some insight into the different dynamics at play and even some suggestions for how to reduce the spread of junk news and amplify the spread of verified news.

Monday, April 30, 2018

IPTC Names Brendan Quinn as New Managing Director, Celebrates the Service of Michael Steidl

As well as being Director of Information Management at the AP, I'm also Chairman of the Board of the IPTC, the standards body for the international news and media industry. The IPTC sets technology standards used around the world, including photo metadata, video metadata and machine-readable rights. We also develop technical approaches to the challenges facing the news media, such as identifying junk news, leveraging automation and coping with the impact of GDPR. Together with the other members of the IPTC - including Reuters, AFP, DPA and the New York Times - I help organize face-to-face meetings and numerous teleconferences so that we can work together and learn about interesting new projects from vendors and academics.
The Managing Director is the sole employee of the IPTC, helping to organize the work, manage the finances and recruit new members. For the last 15 years, Michael Steidl has held this role. When Michael announced his plan to retire in the summer of 2018, I organized and ran the effort to find and recruit a successor. We talked to many candidates, several of whom were highly qualified. The Board did not take the decision lightly. In the end, we made an offer to Brendan Quinn - and we are thrilled he has accepted. Brendan brings with him a wealth of news technology experience, with organizations from around the world and of all sizes. He even worked at the Associated Press on AP's Video Hub. He has a unique combination of strategic insight into the challenges faced by the news industry and the technical know-how to help guide our work in technical standards and beyond. I look forward to partnering with Brendan in charting the future of the organization and in growing the work and influence of the IPTC.

Celebrating Michael Steidl's 15 Years as IPTC Managing Director

At the IPTC's Spring 2018 Meeting in Athens, we covered many interesting technical news topics - including video metadata, machine-readable rights, news credibility, and the challenges of localization and localisation. We also welcomed our incoming Managing Director, Brendan Quinn, and took some time to celebrate our retiring Managing Director, Michael Steidl.

As Chairman of the IPTC, I was honoured to give a speech, marking Michael's achievements over the last 15 years of service to the organization.

I would like to take a few minutes to celebrate Michael Steidl, our IPTC Managing Director for the past 15 years. Looking back over that time, many things stand out. I think we would all agree that Michael has a comprehensive understanding of all aspects of the IPTC - the standards, the history and - last but not least - the often obscure facets of the rules and regulations. Many times, an enthusiastic IPTC person has suggested some change or new idea, only to be gently reminded, "Well, I remember a discussion in 2008 where we voted on that topic and we decided..." I know that we will all miss Michael's kind but firm insistence that the rules and history of the IPTC be respected. And, in many ways, he acts as the representative of all the member organizations, whether or not they happen to be present in a particular discussion, to ensure that, for example, the large organizations don't dominate at the expense of the smaller organizations.

To give some perspective on Michael's achievements, I thought it would be interesting to look back on what Michael himself said about his role and the work of the IPTC. In December of 2003, the IPTC Spectrum - the old newsletter we used to publish - included Michael's reflections on his first year as Managing Director. I recommend you read the whole thing yourself. But now I'd like to focus on three things he said.

First, Michael described visiting the Van Gogh museum in Amsterdam and drew an analogy between the artist's evolving use of colour and his own intention in how he would work within the IPTC. He said "This might be a metaphor for the big task I jumped into: not to reinvent the wheel of running the IPTC office, well developed, maintained, and handed over by David Allen, but to add some extra shades of colour to IPTC’s image as a major player for standards in the news industry." Of course, looking back over the past 15 years of work, it is clear that on the one hand Michael did succeed in taking over the reins from his predecessor. But, on the other hand, he has done a lot more than simply adding some extra shades of colour. In fact, I would say that Michael's contributions to the IPTC are really more equivalent to an entirely new artistic movement - a sort of Renaissance for the organization - including managing the introduction of entirely new ways of operating the IPTC. When Michael started, there were no teleconferences or video conferences or even development of standards through email lists. There was no internet available during the meetings - and its arrival has perhaps been a mixed blessing, since people can keep up with the work back home, but we aren't always as focused.

There were already some hints of these changes in Michael's remarks. The second of the three quotes I want to pick out from the 2003 Spectrum:

We want to "discuss new ways of developing and maintaining our standards. These appear quite necessary to me: in the past decades IPTC usually developed and maintained one to two standards in parallel; IPTC 7901 was succeeded by IIM; and this was followed by NITF over a period of almost 15 years. But now three standards - NITF, NewsML, SportsML - have been developed and approved in a time span of about eight years. These three standards are all currently active and an additional three are under development ProgramGuideML, EventsML and an upcoming weather mark up. So soon we will have six active standards." So, three standards were developed in 15 years. Then three more in eight years. So, the IPTC work was already accelerating. But, just as a reminder, at this meeting in Athens, we discussed six different standards and will vote on three major updates. On top of that, we've discussed ten or more additional work areas - including the VideoDextra initiative, the EXTRA project and everyone's favourite topic of GDPR. Some people might think that standards take a long time to develop - and they do! - but we're no longer producing three standards in 15 years, or even only discussing technical standards anymore.

Now, 15 years later, Michael knows every detail of a full range of standards and an impressive array of initiatives. As was mentioned earlier, Michael has developed an extensive record-keeping scheme of - I believe it was - 5,000 file folders stuffed with IPTC information. Now, with Brendan Quinn coming on board as our new Managing Director this summer, it must seem like a daunting task to succeed such an accomplished Managing Director.

So, I want to come to the third Spectrum quote from Michael from back in 2003, to reassure Brendan that Michael was once in the same boat. Michael said: "Yes, I had to learn the ropes first. IPTC operations are complex and it’s like conquering an unknown island: region after region had to be explored and all details of operation had to be made transparent, for me and to others. Preparing and providing the required resources for a meeting, taking minutes that reproduce the key points of the discussions, handling the finances, and last but not least supporting and co-ordinating the technical work of IPTC was occasionally really breathtaking and I have to admit it was a steep learning curve." So, Brendan, don't worry: it wasn't easy for Michael either, but it can be done!

Finally, I want to close with an entirely different aspect of Michael's time with the IPTC. I've talked a lot about Michael's work. And, of course, solving news technology problems is the main reason for IPTC's existence. However, Michael has always pointed out in his polite, gentle but firm way that there is more to it than that. The IPTC is also an organization made up of people. It is a unique mix of people who often come from rival organizations and quite different backgrounds, who are able to come together and learn from each other and co-operate to solve problems together. And, in that process, it is often the case that rivals can become colleagues and colleagues can become friends. Michael, many of the people here - and many others around the world - count you as a friend. And so, along with your many work achievements with the IPTC, you should be very proud of all the colleagues and friends you have made.

And now, I'd like to ask all of your colleagues and friends here to join me in raising a glass, thanking you and wishing you a very happy retirement. THANK YOU MICHAEL!

Friday, January 19, 2018

Developing the Digital Marketplace for Copyrighted Works Meeting in Washington DC on January 25th

I'm looking forward to the "Developing the Digital Marketplace for Copyrighted Works" meeting in Washington DC on January 25th. This is the latest in a series of meetings organized by the U.S. Department of Commerce's Internet Policy Task Force. The stated goal is to "facilitate constructive, cross-industry dialogue among stakeholders about ways to promote a more robust and collaborative online marketplace for copyrighted works".

I will be speaking on the "identification" panel ("Capturing Content, People, Permissions"). I attended the previous meeting (see my short summary and watch the videos). If a day of rights discussions sounds like your idea of fun, then this is a free, open-to-the-public meeting (sign up). You can also watch the live stream.

Monday, November 20, 2017

The View from Barcelona - IPTC AGM 2017

I Chair the Board of Directors of IPTC, a consortium of news agencies, publishers and system vendors, which develops and maintains technical standards for news, including NewsML-G2, rNews and News-in-JSON. I work with the Board to broaden adoption of IPTC standards, to maximize information sharing between members and to organize successful face-to-face meetings.

We hold face-to-face meetings in several locations throughout the year, although most of the detailed work of the IPTC is now conducted via teleconferences and email discussions. Our Annual General Meeting for 2017 was held in Barcelona in November. As well as being the time for formal votes and elections, the AGM is a chance for the IPTC to look back over the last year and to look ahead to what is in store. What follows is a slightly edited version of my remarks at the Barcelona AGM.
IPTC has had a good year - the 52nd year for the organization!
We've updated our veteran standards, Photo metadata - our most widely-used standard - and NewsML-G2 - our most comprehensive XML standard, marking its 10th year of development.
We're continuing to work in partnership with other organizations, to maximize the reach and benefits of our work for the news and media industry. In coordination with CEPIC we organized the 10th annual Photo Metadata Conference, looking to the future of auto tagging and search, examining advanced AI techniques - and considering both their benefits and their drawbacks for publishers. With the W3C we have crafted the ODRL rights standard and are launching plans to create RightsML as the official profile of the ODRL standard, endorsed by both the IPTC and W3C.
We've also tackled problems that matter to the media industry with technology solutions which are founded on standards, but go beyond them. The Video Metadata Hub is a comprehensive solution for video metadata management that allows exchange of metadata over multiple existing standards. The EXTRA engine is a Google DNI sponsored project to create an open source rules based classification engine for news.
We've had some changes in the make-up of IPTC. Johan Lindgren of TT joined the Board. Bill Kasdorf has taken over as the PR Chair. And we were thrilled to add Adobe as a voting member of IPTC, after many years of working together on photo metadata standards. Of course, with more mixed emotions, we have also learnt that Michael Steidl, the IPTC Managing Director for 15 years, will retire next summer. As has been clear throughout this meeting and, indeed, every day between the meetings on numerous emails and phone calls, Michael is the backbone of the work of the IPTC. Once again, I ask you to join me in acknowledging the amazing contributions and dedication that Michael displays towards the IPTC.
Later today, we will discuss in detail our plans to recruit a successor for the crucial role of the Managing Director. And this is not the only challenge that the IPTC faces. We describe ourselves as "the global standards body of the news media" and that "we provide the technical foundation for the news ecosystem". As such, just as the wider news industry is facing a challenging business and technical environment, so is the IPTC.
During this meeting, we've talked about some of the technical challenges - including the continuing evolution of file formats and supporting technologies, whilst many of us are still working to adopt the technologies from 5 or 10 years ago. We've also talked about the erosion of trust in media organizations and whether a combination of editorial and technical solutions can help.
But I thought I would focus on a particular shift in the business and technical environment for news that may well have a bigger impact than all of those. That shift can be traced back to 2014 which, by coincidence, is when I became Chairman of the IPTC. Last week, Andre Staltz published an interesting and detailed article called "The Web Began Dying in 2014, Here's How". If you haven't read it, I recommend it. The article makes a number of interesting points and backs them up with numerous charts and statistics. I will not attempt to summarize the whole thing, but a few key points are worth highlighting.
Staltz points out that, prior to 2014, Google and Facebook accounted for less than 50% of all of the traffic to news publisher websites. Now those two companies alone account for over 75% of referral traffic. Also, through various acquisitions, Google and Facebook properties now share the top ten websites with news publishers - in the USA 6 of the 10 most popular websites are media properties. In Brazil it is also 6 out of 10. In the UK it is 5 out of 10. The rest all belong to Facebook and Google.
Both Facebook and Google reorganized themselves in 2014, to better focus on their core strengths. In 2014, Facebook bought WhatsApp and terminated its search relationship with Bing, effectively relinquishing search to Google and doubling down on social. Also in 2014, Google bought DeepMind and shut down Orkut, its most successful social product. This, along with the reorganization into Alphabet, meant that Google relinquished social to Facebook, allowing it to focus on search and - even more - artificial intelligence. Thus, each company seems happy to dominate its own massive part of the web.
But ... does that matter to media companies? Well, Facebook said if you want optimal performance on our website, you must adopt Instant Articles. Meanwhile, Google requires publishers to use its Accelerated Mobile Pages or "AMP" format for better performance on mobile devices. And, worldwide, Internet traffic is shifting from the desktop to mobile devices.
Then, if you add in Amazon, Apple and Microsoft, it is clear that another huge shift is going on. All of the Frightful Five are turning away from the Web as a source of growth and instead turning to building brand loyalty via high end devices. Following the successful strategy of Apple, they are all becoming hardware manufacturers with walled gardens. Already we have Siri, Cortana, Alexa and Google Home. But also think about the investments going on by these companies in AR and VR as ways to dominate social interactions, e-commerce and machine learning over the Internet.
So, just as news companies must confront these shifts in the global business and technology environment, so must the IPTC. During this meeting, we've talked about our initial efforts to grapple with metadata for AR, VR and 360 degree imagery. We've also discussed techniques which are relevant to news taxonomy and classification, including machine learning and artificial intelligence. At the same time, Facebook, Google and others are not totally in control, as they - along with Twitter - found themselves having to explain the spread of disinformation on their platforms and under increased government scrutiny, particularly in the EU. So, all of us, whether we describe ourselves as news publishers or not, are dealing with a rapidly changing and turbulent information, technical and business environment.
What does this mean for IPTC? IPTC is a news technology standards organization. But it is also unique in that we are composed of news companies from around the world. We know from the membership survey that both of these factors - influence over technical solutions and access to technology peers from competitors, partners, diverse organizations large and small - are very important to current members. In order to prosper as an organization, IPTC needs to preserve these unique benefits to members, but also scale them up. This means that we need to find ways to open up the organization in ways that preserve the value of the IPTC and fit with the mission, but also in ways that are more flexible. We need to continue to move beyond saying that the only thing we work on is standards and instead use standards as a component of the technical solutions we develop, as we are doing with EXTRA and the Video Metadata Hub. We need to work with diverse groups focused on solving specific business and journalistic problems - such as trust in the media - and in helping news companies learn the best ways to work with emerging technologies, whether it is voice assistants, artificial intelligence or virtual reality.
I'm confident that - working together - we can continue to reshape the IPTC to better meet the needs of the membership and to move us further forward in support of solving the business and editorial needs of the news and media industry. I look forward to working with all of you on addressing the challenges in 2018 and beyond.
Thank you.

Wednesday, August 30, 2017

Serverless Tip: Use the "artifact" Directive to Deploy Your Pre-Built Lambda Zip File

tl;dr: you can deploy pre-built zip files (e.g. for your Python Lambda) using the "artifact" directive in the serverless framework.

AWS Lambda is Great!

I've been doing a lot of work recently with AWS Lambda. And I'm a fan. The combination of API Gateway + Lambda + Python, together with other AWS services including DynamoDB and S3, not to mention the awesome array of Python open source libraries, means I'm churning out all sorts of microservices with glee.

The serverless paradigm is quite different than the traditional (serverfull?) paradigm. As well as adjusting the architectural style to take advantage of what Lambda offers, deploying the code and all of its dependencies is quite different. After looking at some alternatives, we concluded that the Serverless Framework best fit our requirements.

The Serverless Framework is Great!

Rather than crafting complex CloudFormation configurations to manage my microservices in AWS, I use the Serverless Framework. (The framework also works with Apache OpenWhisk, Microsoft Azure and Google Cloud). Essentially Serverless is a simpler CloudFormation, specific to Lambda-centric deployments. (To be clear, it doesn't just help with deployments of AWS Lambda - Serverless covers a wide and growing range of AWS services).

Mainly by studying (*cough* copy-n-pasting *cough*) the extensive range of examples, and sometimes resorting to actually reading the manual, I've been able to get even quite complex setups to work, with fairly simple YAML configuration files. So, I recommend Serverless. (Although AWS themselves are developing an eerily similar alternative, in SAM, which you may also want to check out).

AWS Lambda has Limits

Sometimes you need to do stuff outside the Serverless Framework, but you still want to use all the other cool stuff it does for you.

For example, AWS Lambda has certain limits. This includes a 50 MB deployment limit per Lambda. Now, Serverless does let you control what goes into the Lambda package via the "include" and "exclude" directives, within the "package" directive. But, sometimes, you're sailing very close to the 50 MB limit and the only way to stay underneath is to directly create your zip package yourself. Or, in my case, you have a zip file which has precisely what you need, but you also need to manipulate it to add in a pickled bit of code. (Which you do via the Python zipfile library).
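
As a rough illustration of that zip manipulation, here is a minimal sketch using only the Python standard library. The function name, the archive member name and the idea of returning the archive size are my own assumptions for the example, not details of the actual project.

```python
import os
import pickle
import zipfile


def add_pickle_to_zip(zip_path, arcname, obj):
    """Append a pickled object to an existing zip and return the zip's size in bytes.

    zip_path, arcname and obj are hypothetical placeholders for this sketch.
    """
    payload = pickle.dumps(obj)
    # Mode "a" appends to the existing archive instead of overwriting it.
    with zipfile.ZipFile(zip_path, mode="a", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(arcname, payload)
    # Returning the size makes it easy to check against the deployment limit.
    return os.path.getsize(zip_path)
```

Calling something like add_pickle_to_zip("build/my-lambda.zip", "model.pkl", my_object) (again, made-up names) would leave you with a single zip that can then be handed to your deployment tool as-is.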

It took me a while to figure out, but you can use the "artifact" directive as the way to deploy a zip you've packaged already.
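
For reference, a minimal sketch of what that looks like in serverless.yml. The service name, runtime, function name, handler and zip path are all placeholders I've made up for illustration; substitute your own.

```yaml
# Hypothetical serverless.yml; names and paths are placeholders.
service: my-service

provider:
  name: aws
  runtime: python3.6   # assumption: use whichever runtime you actually target

package:
  # Point Serverless at a zip you built yourself instead of letting it package the code.
  artifact: build/my-lambda.zip

functions:
  hello:
    handler: handler.hello   # module.function inside the zip
```

With "artifact" set, Serverless deploys the zip you give it rather than building its own package, so the contents of the zip are entirely up to you.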

So, there you have it: Lambda is great, but you should use Serverless (or something like it) to simplify your deployments. And you can deploy pre-built zip files using the "artifact" directive.

Tuesday, August 29, 2017

Emoji, Fake News and 99% Invisible

This morning, I was listening to 99% Invisible, the podcast all about architecture and design.
🤔
This episode "Person in Lotus Position" was about the process of adding a new emoji to the official set. At one point, they spoke to Jennifer 8. Lee who is on the Unicode Emoji Subcommittee. I know Jenny through Misinfocon. This is a new effort to fight the spread of disinformation on the web via a Knight-funded Credibility Schema Working Group. The goal is to create ways which signal whether a given piece of information on the web is credible.
Most of the podcast episode describes the workings of the Unicode committee, which is the official standards body for deciding which characters computers and phones will recognize and exchange. It gave a pretty good introduction to the importance and difficulty of this kind of standards work. (As well as being involved in the Credibility Schema Working Group, I'm also the Chairman of the IPTC, the news technology standards body. So, I like to think I have some insight into how these things work).
If you, like me, are interested in emoji and/or the workings of technical standards groups, then I recommend the episode. (Also, if you're interested in stopping the spread of fake news or in promoting technical standards within the global news industry, feel free to get in touch).