Should the Government Dabble into AI Researchers’ Preference to Publish Openly?

In academia, researchers have always preferred to publish cutting-edge research openly, because publishing in high-impact journals is a prerequisite for faculty to secure grants and for graduate students to earn advanced degrees. Lately, corporate research labs, especially in the field of Artificial Intelligence (AI), have started to follow the same trend, as many faculty members now either migrate to these labs or hold dual appointments, one in academia and another in industry. Researchers with dual appointments are especially likely to continue publishing their cutting-edge research openly. They are often drawn from academia to industry by the lure of access to very large-scale datasets, which are chiefly held by companies and are a prerequisite for training the popular machine learning family of AI algorithms. However, datasets, algorithms, and software are “fluid” commodities that know no national boundaries. Traditionally, governments have tried to control the best research and its access to the wider world, but today many researchers in industry are publishing such research openly. This article briefly discusses three factors that should be taken into consideration when fashioning policy at the intersection of researchers’ preference to publish openly and the government’s endeavor to control the dissemination of the most cutting-edge AI research.

In 2016, the Obama administration released a report [1] on the future of AI. This report contained several papers aimed at taking the United States toward a more holistic approach to AI. Recommendation 21 in this report suggested that the U.S. government “should deepen its engagement with key international stakeholders, including foreign governments, international organizations, industry, academia, and others, to exchange information and facilitate collaboration of AI R&D.” However, one critical roadblock to effective collaboration between the government and researchers is the government’s preference to control the most cutting-edge AI research and the researchers’ preference to disseminate it openly. Worryingly, many of the recommendations of this report are being followed closely by China rather than by the U.S., because of China’s ability to enforce policies more strictly from the top down [2]. Today, not only academic labs but also research units in various companies actively publish their research and results openly. The traditional distinction between software and hardware is more complicated when it comes to AI. Advances in AI algorithms require advanced hardware systems to run them, but once deployed, AI becomes software that can easily be diffused and disseminated for free across national borders. This makes it imperative for the government to continually assess whether the researchers’ preference to publish their results, data, and software openly may create a threat to national security or diminish the nation’s competitive edge over other nations with advanced AI research programs. At present, the U.S. government regularly funds a variety of AI research through agencies such as DARPA, NSF, and NIH, while also organizing prize competitions such as the DARPA Grand Challenges.
Researchers in various academic institutions receive funding from these government agencies and publish the cutting-edge AI research they undertake. Many researchers transition from academia to industry, where they continue to publish their results, and in this age of open-source platforms, companies themselves often disseminate large-scale datasets and software. Publishing research is not new; what is different now is how fast the research is disseminated and how quickly it can be misused for nefarious purposes. Consequently, the government today finds itself in an increasingly uncomfortable position where it is merely reacting to events and is not in control of them should anything go wrong. In this position, the government must design policies that do not interfere with the researchers’ preference to publish while at the same time serving the nation’s interests by guarding specific types of AI research. Three factors that may contribute toward such policy-making are:

Identification of sensitive AI areas: The government’s funding agencies mentioned above currently have a system in place that provides funding to researchers working on sensitive AI areas (nuclear research, military-based robotic applications, etc.) after verifying their citizenship credentials and performing a background check if necessary. But the same system does not strictly extend to the publication of sensitive research or to its presentation by the researchers at international conferences. It must be assessed which licenses could be used to publish such research, and which factors (the time between obtaining results and publishing them, the pre-selection of research journals, etc.) must be taken into consideration while designing a policy that allows researchers to publish research in these key AI areas.
Evaluating the free flow of data: In May 2017, The Economist published a cover story [3] arguing that data has replaced oil as “The World’s Most Valuable Resource.” However, unlike oil, whose flow can be controlled easily by governments across the world, data is much more volatile. Accessing datasets containing millions of images, or uploading to the internet the software framework for implementing an AI algorithm, is now just a click away. Researchers in industry as well as academia regularly release such data and software to boost the AI community and to garner more research citations. Thus, in addition to identifying the sensitive AI areas as mentioned above, the provisions presently in place (cybersecurity, data protection, data encoding, etc.) must be assessed, including how well researchers adhere to them when releasing such data and software to the wider world. Additionally, it should be evaluated whether a policy guideline needs to be laid out for researchers publishing certain types of potentially sensitive datasets (satellite images, demographic statistics of every house in a district, etc.), irrespective of the user license.
Evaluating the free flow of AI hardware: AI algorithms are highly computationally intensive and require advanced hardware systems to run fast. These hardware systems are either programmable Graphics Processing Units (GPUs) or dedicated data centers housing custom-made chips designed by various companies specifically to run AI algorithms [4]. However, many of these data centers are located outside the U.S., and advances in chip design are also published openly by researchers. Furthermore, the supply chains for AI chips are already highly globalized [5]. Thus, if an unwanted entity gets access to such advanced chips, which took months or even years to take from design to fabrication, it can use reverse engineering to replicate them within a few weeks. This raises a need to explore whether the present policies are safeguarding the advances in AI hardware and ensuring that the security measures of data centers outside the U.S. are adequate. This policy-making process should also connect with the evaluation of the free flow of data above: AI is not merely software, and sensitive chip design data should therefore be stored in data centers that have sufficient safeguards to prevent theft and misuse.

Previous studies have shown that attempts by governments to regulate the free flow of data often lead to slower technology adoption and less innovation [6]. Hence, any policy in this regard must be formulated with the consent of researchers. This is why multiple surveys and interviews should be conducted among researchers in both academia and industry to collect data about their views on the free flow of information, on open publishing versus self-censoring of sensitive data, and on protecting data from theft through cybersecurity measures. The findings from all of the above must be compiled into an interim report every three months to continuously track the progress of the project and receive valuable feedback.

References

[1] The Administration’s Report on the Future of Artificial Intelligence, October 12, 2016, p. 35, https://obamawhitehouse.archives.gov/blog/2016/10/12/administrations-report-future-artificial-intelligence.

[2] Gregory Allen and Elsa B. Kania, “China Is Using America’s Own Plan to Dominate the Future of Artificial Intelligence,” Foreign Policy, September 08, 2017, https://foreignpolicy.com/2017/09/08/china-is-using-americas-own-plan-to-dominate-the-future-of-artificial-intelligence/.

[3] “The world’s most valuable resource is no longer oil, but data,” The Economist, May 6, 2017, https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data.

[4] Michael Horowitz, et al., “Strategic competition in an era of artificial intelligence,” Center for a New American Security, 2018, p. 7, https://www.indexinvestor.com/resources/Research-Materials/NatSec/Strategic_Competition_in_Era_of_AI.pdf.

[5] Saif M. Khan, “Maintaining the AI Chip Competitive Advantage of the United States and its Allies,” Center for Security and Emerging Technology, December 2019, https://cset.georgetown.edu/wp-content/uploads/CSET-Maintaining-the-AI-Chip-Competitive-Advantage-of-the-United-States-and-its-Allies-20191206.pdf.

[6] A. R. Miller and Catherine Tucker, “Privacy protection and technology diffusion: The case of electronic medical records,” Management Science 55.7 (2009): 1077-1093, https://pubsonline.informs.org/doi/pdf/10.1287/mnsc.1090.1014.
