The 33rd AAAI Conference on Artificial Intelligence (AAAI-19) is now underway in Hawaii, USA. The program chairs are Pascal Van Hentenryck (Georgia Institute of Technology, USA) and Zhi-Hua Zhou (Nanjing University, China). The annual AAAI conference aims to promote AI research and scientific exchanges among AI researchers, practitioners, scientists, and engineers in affiliated disciplines. The conference has a diverse technical track and includes student abstracts, poster sessions, invited speakers, tutorials, workshops, and exhibit and competition programs.



There were a record-high 7,745 AAAI paper submissions this year, while the paper acceptance rate hit a record low of just 16.2 percent. The AAAI-19 Special Awards and Honors, including the Outstanding Paper, Honorable Mention, Outstanding Student Paper, and Blue Sky Idea awards, were presented on Tuesday, January 29 by Subbarao Kambhampati, Awards Committee Chair and AAAI Past President; Yolanda Gil, AAAI President; and Bart Selman, AAAI President-Elect.



Outstanding Paper: How to Combine Tree-Search Methods in Reinforcement Learning

Authors: Yonathan Efroni, Gal Dalal, Bruno Scherrer and Shie Mannor

Institutions: Technion, Israel; INRIA, Villers-lès-Nancy, France

Paper link: https://arxiv.org/abs/1809.01843

Abstract: Finite-horizon lookahead policies are abundantly used in Reinforcement Learning and demonstrate impressive empirical success. Usually, the lookahead policies are implemented with specific planning methods such as Monte Carlo Tree Search (e.g. in AlphaZero). Referring to the planning problem as tree search, a reasonable practice in these implementations is to back up the value only at the leaves while the information obtained at the root is not leveraged other than for updating the policy. Here, we question the potency of this approach. Namely, the latter procedure is non-contractive in general, and its convergence is not guaranteed. Our proposed enhancement is straightforward and simple: use the return from the optimal tree path to back up the values at the descendants of the root. This leads to a \gamma^h-contracting procedure, where \gamma is the discount factor and h is the tree depth. To establish our results, we first introduce a notion called multiple-step greedy consistency. We then provide convergence rates for two algorithmic instantiations of the above enhancement in the presence of noise injected to both the tree search stage and value estimation stage.
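The proposed fix is easy to sketch. The toy below is an illustration only, not the authors' code: the chain MDP, `h_step_backup`, and every name in it are invented. It exhaustively searches a depth-h tree using a deterministic model and backs up the return of the optimal path at the root, the \gamma^h-contracting update the abstract describes:

```python
# Illustrative sketch of optimal-path backup in h-step lookahead (toy example).

def h_step_backup(state, V, actions, step, gamma=0.9, h=3):
    """Exhaustively search the depth-h tree rooted at `state`.
    Returns (best h-step return, best first action).
    `step(s, a) -> (s', r)` is assumed to be a deterministic model."""
    def search(s, depth):
        if depth == 0:
            return V.get(s, 0.0), None     # leaf: fall back on value estimate
        best, best_a = float("-inf"), None
        for a in actions:
            s2, r = step(s, a)
            tail, _ = search(s2, depth - 1)
            ret = r + gamma * tail
            if ret > best:
                best, best_a = ret, a
        return best, best_a
    return search(state, h)

# Toy chain MDP: states 0..5; action 1 moves right (reward 1 on reaching 5),
# action 0 stays put.
def step(s, a):
    s2 = min(s + a, 5)
    return s2, 1.0 if s2 == 5 else 0.0

V = {}                                      # value estimates, initially zero
ret, act = h_step_backup(2, V, actions=[0, 1], step=step)
V[2] = ret                                  # back up the optimal-path return
```

Backing up only the leaf value at the root, the common practice the paper questions, need not be a contraction; the optimal-path return above contracts with modulus \gamma^h.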

Honorable Mention: Solving Imperfect-Information Games via Discounted Regret Minimization

Authors: Noam Brown, Tuomas Sandholm

Institution: Carnegie Mellon University

Paper link: https://arxiv.org/abs/1809.04040

Abstract: Counterfactual regret minimization (CFR) is a family of iterative algorithms that are the most popular and, in practice, fastest approach to approximately solving large imperfect-information games. In this paper we introduce novel CFR variants that 1) discount regrets from earlier iterations in various ways (in some cases differently for positive and negative regrets), 2) reweight iterations in various ways to obtain the output strategies, 3) use a non-standard regret minimizer and/or 4) leverage “optimistic regret matching”. They lead to dramatically improved performance in many settings. For one, we introduce a variant that outperforms CFR+, the prior state-of-the-art algorithm, in every game tested, including large-scale realistic settings. CFR+ is a formidable benchmark: no other algorithm has been able to outperform it. Finally, we show that, unlike CFR+, many of the important new variants are compatible with modern imperfect-information-game pruning techniques and one is also compatible with sampling in the game tree.
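To make the discounting idea concrete, here is a hedged, self-contained sketch, not the authors' implementation: regret matching on Rock-Paper-Scissors in which iteration t's regrets and strategy contributions are weighted by t, the linear-weighting scheme among those the paper studies. The game and function names are ours.

```python
# Linear-weighted regret matching on Rock-Paper-Scissors (illustrative only).

PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]     # row player's payoff: rock, paper, scissors

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def current_strategy(regrets):
    # Regret matching: play in proportion to positive regret.
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    return [p / total for p in pos] if total > 0 else [1/3] * 3

def linear_rm_rps(iters=20000):
    r1 = [1.0, 0.0, 0.0]          # small asymmetric start so play leaves uniform
    r2 = [0.0, 0.0, 0.0]
    s1_sum, s2_sum = [0.0] * 3, [0.0] * 3
    for t in range(1, iters + 1):
        s1, s2 = current_strategy(r1), current_strategy(r2)
        u1 = matvec(PAYOFF, s2)   # P1's expected payoff per pure action
        u2 = matvec(PAYOFF, s1)   # RPS is antisymmetric, so P2's payoffs likewise
        ev1 = sum(p * u for p, u in zip(s1, u1))
        ev2 = sum(p * u for p, u in zip(s2, u2))
        for i in range(3):
            # Linear discounting: weight iteration t's contribution by t.
            r1[i] += t * (u1[i] - ev1)
            r2[i] += t * (u2[i] - ev2)
            s1_sum[i] += t * s1[i]
            s2_sum[i] += t * s2[i]
    z1, z2 = sum(s1_sum), sum(s2_sum)
    return [x / z1 for x in s1_sum], [x / z2 for x in s2_sum]

avg1, avg2 = linear_rm_rps()      # both approach the uniform equilibrium
```

Weighting by t makes late (better-informed) iterations dominate both the regrets and the output average, which is the intuition behind the paper's discounted variants; here the averaged strategies settle near (1/3, 1/3, 1/3).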

Outstanding Student Paper: Zero Shot Learning for Code Education: Rubric Sampling with Deep Learning Inference

Authors: Mike Wu, Milan Mosse, Noah Goodman and Chris Piech

Institution: Stanford University

Paper link: https://arxiv.org/abs/1809.01357

Abstract: In modern computer science education, massive open online courses (MOOCs) log thousands of hours of data about how students solve coding challenges. Being so rich in data, these platforms have garnered the interest of the machine learning community, with many new algorithms attempting to autonomously provide feedback to help future students learn. But what about those first hundred thousand students? In most educational contexts (i.e. classrooms), assignments do not have enough historical data for supervised learning. In this paper, we introduce a human-in-the-loop “rubric sampling” approach to tackle the “zero shot” feedback challenge. We are able to provide autonomous feedback for the first students working on an introductory programming assignment with accuracy that substantially outperforms data-hungry algorithms and approaches human level fidelity. Rubric sampling requires minimal teacher effort, can associate feedback with specific parts of a student’s solution and can articulate a student’s misconceptions in the language of the instructor. Deep learning inference enables rubric sampling to further improve as more assignment specific student data is acquired. We demonstrate our results on a novel dataset from Code.org, the world’s largest programming education platform.
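The mechanism is easy to illustrate. In the hypothetical sketch below, the rubric, labels, and all names are invented for illustration and are not taken from the paper: an instructor-written rubric acts as a tiny probabilistic grammar, and sampling it yields synthetic (solution, feedback) pairs that substitute for the historical student data a brand-new assignment lacks.

```python
# Toy "rubric sampling": sample labeled solutions from an instructor rubric.
import random

random.seed(0)

# Each rubric entry: (probability of the correct choice, code fragment, feedback label).
RUBRIC = [
    (0.7, "for i in range(n): total += i", None),                  # correct loop
    (0.3, "for i in range(n): total = i", "forgot to accumulate"), # common bug
    (0.8, "return total", None),
    (0.2, "print(total)", "printed instead of returning"),
]

def sample_solution():
    """Sample one synthetic solution and its feedback labels from the rubric."""
    lines, labels = [], []
    # Each decision point pairs a correct fragment with a buggy alternative.
    for correct, buggy in [(RUBRIC[0], RUBRIC[1]), (RUBRIC[2], RUBRIC[3])]:
        choice = correct if random.random() < correct[0] else buggy
        lines.append(choice[1])
        if choice[2]:
            labels.append(choice[2])
    return "\n".join(lines), labels

def feedback_for(student_code, n_samples=500):
    """Label a student's code by matching it against rubric samples."""
    for _ in range(n_samples):
        code, labels = sample_solution()
        if code == student_code:
            return labels
    return ["no matching rubric path"]

buggy = "for i in range(n): total = i\nreturn total"
fb = feedback_for(buggy)
```

This only shows the zero-shot sampling step; in the paper, deep learning inference additionally fits a model to rubric samples so the system keeps improving as real student data arrives.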



Outstanding Student Paper Honorable Mention: Learning to Teach in Cooperative Multiagent Reinforcement Learning

Authors: Shayegan Omidshafiei, Dong-Ki Kim, Miao Liu, Gerald Tesauro, Matthew Riemer, Christopher Amato, Murray Campbell and Jonathan P. How

Institutions: LIDS, MIT; MIT-IBM Watson AI Lab; IBM Research; CCIS, Northeastern University

Paper link: https://arxiv.org/abs/1805.07830

Abstract: Collective human knowledge has clearly benefited from the fact that innovations by individuals are taught to others through communication. Similar to human social groups, agents in distributed learning systems would likely benefit from communication to share knowledge and teach skills. The problem of teaching to improve agent learning has been investigated by prior works, but these approaches make assumptions that prevent application of teaching to general multiagent problems, or require domain expertise for problems they can apply to. This learning to teach problem has inherent complexities related to measuring long-term impacts of teaching that compound the standard multiagent coordination challenges. In contrast to existing works, this paper presents the first general framework and algorithm for intelligent agents to learn to teach in a multiagent environment. Our algorithm, Learning to Coordinate and Teach Reinforcement (LeCTR), addresses peer-to-peer teaching in cooperative multiagent reinforcement learning. Each agent in our approach learns both when and what to advise, then uses the received advice to improve local learning. Importantly, these roles are not fixed; these agents learn to assume the role of student and/or teacher at the appropriate moments, requesting and providing advice in order to improve teamwide performance and learning. Empirical comparisons against state-of-the-art teaching methods show that our teaching agents not only learn significantly faster, but also learn to coordinate in tasks where existing methods fail.



Classic Paper: Content-Boosted Collaborative Filtering for Improved Recommendations

Authors: Prem Melville, Raymond J. Mooney and Ramadass Nagarajan

Institution: University of Texas at Austin

Paper link: https://www.cs.utexas.edu/~ml/papers/cbcf-aaai-02.pdf

Abstract: Most recommender systems use Collaborative Filtering or Content-based methods to predict new items of interest for a user. While both methods have their own advantages, individually they fail to provide good recommendations in many situations. Incorporating components from both methods, a hybrid recommender system can overcome these shortcomings. In this paper, we present an elegant and effective framework for combining content and collaboration. Our approach uses a content-based predictor to enhance existing user data and then provides personalized suggestions through collaborative filtering. We present experimental results that show how this approach, Content-Boosted Collaborative Filtering, performs better than a pure content-based predictor, pure collaborative filter, and a naive hybrid approach.
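The hybrid is simple to sketch. The toy below is a simplified illustration over invented data and names, not the paper's system: a content-based predictor densifies the sparse rating matrix into "pseudo-ratings", and user-based collaborative filtering then runs on the dense matrix.

```python
# Toy content-boosted collaborative filtering (illustrative sketch).
import math

ITEM_FEATURES = {"a": {"scifi"}, "b": {"scifi"}, "c": {"romance"}}
RATINGS = {  # None marks an unrated item
    "u1": {"a": 5, "b": None, "c": 1},
    "u2": {"a": 4, "b": 5, "c": None},
}

def content_predict(user, item):
    """Content-based fallback: average the user's ratings on items sharing a feature."""
    shared = [r for other, r in RATINGS[user].items()
              if r is not None and ITEM_FEATURES[other] & ITEM_FEATURES[item]]
    return sum(shared) / len(shared) if shared else 3.0  # 3.0 = neutral prior

def pseudo_ratings():
    """Densify the matrix: keep real ratings, fill gaps with content predictions."""
    return {u: {i: (r if r is not None else content_predict(u, i))
                for i, r in items.items()}
            for u, items in RATINGS.items()}

def cosine(u, v):
    dot = sum(u[i] * v[i] for i in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def cf_predict(user, item):
    """User-based collaborative filtering over the dense pseudo-ratings."""
    dense = pseudo_ratings()
    num = den = 0.0
    for other, vec in dense.items():
        if other == user:
            continue
        sim = cosine(dense[user], vec)
        num += sim * vec[item]
        den += abs(sim)
    return num / den if den else dense[user][item]
```

Because missing entries are filled in before similarities are computed, two users stay comparable even when their rated items barely overlap, the sparsity failure mode of a pure collaborative filter that the hybrid is designed to overcome.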

Feigenbaum Prize 2019

The AAAI Feigenbaum Prize is awarded biennially to recognize and encourage outstanding Artificial Intelligence research advances made by using experimental methods of computer science. The 2019 award was presented to Stuart Russell of the University of California, Berkeley for his innovation and achievement in probabilistic knowledge representation, reasoning and learning.



Blue Sky Idea 2019

AAAI partnered with the Computing Research Association's Computing Community Consortium (CCC) to recognize three submitted papers with Blue Sky Idea awards. The honored papers represent ideas and visions that can inspire new problems, application areas, and research directions:

First Place: Explainable, Normative, and Justified Agency (Pat Langley)

Second Place: Building Ethically Bounded AI (Francesca Rossi, Nicholas Mattei)

Third Place: Recommender Systems: A Healthy Obsession (Barry Smyth)

Outstanding Senior Program Committee Awards

The Outstanding Senior Program Committee Member award recognizes senior program committee members for exemplary service during the conference review process. Six members received the award this year:

Xiang Bai (Huazhong University of Science and Technology, China)

Hendrik Blockeel (KU Leuven, Belgium)

Zico Kolter (Carnegie Mellon University, USA)

Michele Lombardi (Università di Bologna, Italy)

Aditya Menon (Google Research, USA)

Steven Schockaert (Cardiff University, UK)

The Thirty-Third AAAI Conference on Artificial Intelligence also encompasses the Conference on Innovative Applications of Artificial Intelligence and the Ninth Symposium on Educational Advances in Artificial Intelligence. The conference runs through February 1 at the Hilton Hawaiian Village in Honolulu.