
Tuesday, March 11, 2025

Reward - Reinforcement learning and Turing Award 2024

Every time a duck jumps into the pond, it is looking for its reward: a catch of fat fish! Life is often all about recognition and rewards! While humans see reward as an incentive (so do animals and birds), would a machine ever be motivated by a reward? Machines lack emotions and intrinsic motivation, yet they can respond to rewards by design, through algorithmic optimization, mathematics, and probabilistic decision making!

 


Some interesting details about this year's Turing Award recipients! Recently, Andrew G. Barto and Richard S. Sutton were announced as the recipients of the 2024 ACM A.M. Turing Award for developing the conceptual and algorithmic foundations of reinforcement learning.

Reinforcement learning (RL) is an interdisciplinary area of machine learning and optimal control concerned with how an intelligent agent should take actions in a dynamic environment in order to maximize a reward signal. Reinforcement learning is one of the three basic machine learning paradigms, alongside supervised learning and unsupervised learning.  

Some queries on what RL is led me to this reading. Imagine a robot learning to navigate a maze:

1. The robot (agent) starts at the beginning of the maze (environment).

2. The robot tries moving forward (action) and receives a reward if it reaches a milestone or a penalty if it hits a wall.

3. The robot updates its policy based on the rewards and penalties, adjusting its navigation strategy.

4. The robot continues to explore and learn, refining its policy to reach the maze's end efficiently. 
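For readers who like to see it in code, the four steps above can be sketched with tabular Q-learning, one common RL algorithm. This is only a minimal illustration under my own assumptions, not the method of any particular paper: the 4x4 maze layout, the reward values (+10 for the goal, -1 for hitting a wall, -0.1 per move) and the settings alpha, gamma and epsilon are all made up for the example.

```python
import random

# Hypothetical 4x4 maze (an assumption for this sketch):
# S = start, G = goal, # = wall, . = free cell.
MAZE = [
    "S.#.",
    ".##.",
    "....",
    "#.#G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
START = (0, 0)


def step(state, action):
    """Environment: apply an action, return (next_state, reward, done)."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    inside = 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])
    # Hitting a wall or the maze edge: stay in place and take a penalty.
    if not inside or MAZE[nr][nc] == "#":
        return state, -1.0, False
    if MAZE[nr][nc] == "G":
        return (nr, nc), 10.0, True  # reward for reaching the end
    return (nr, nc), -0.1, False  # small cost per move favours short routes


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Agent: learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated long-run reward, default 0.0
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap the length of one attempt
            # Explore occasionally, otherwise take the best-known action.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)),
                             key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action)
            # Q-learning update: nudge the estimate toward
            # reward + discounted value of the best next action.
            best_next = 0.0 if done else max(
                q.get((nxt, a), 0.0) for a in range(len(ACTIONS)))
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the learned policy from the start, without exploring."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(range(len(ACTIONS)), key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path
```

Calling greedy_path(train()) returns the list of cells the trained robot visits from the start. The table q plays the role of the policy in step 3: every reward or penalty adjusts one entry, and the robot's route is simply whichever action currently looks best in each cell.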

The field of artificial intelligence (AI) is generally concerned with constructing agents—that is, entities that perceive and act. More intelligent agents are those that choose better courses of action. Therefore, the notion that some courses of action are better than others is central to AI. Reward—a term borrowed from psychology and neuroscience—denotes a signal provided to an agent related to the quality of its behavior. Reinforcement learning (RL) is the process of learning to behave more successfully given this signal. 

The idea of learning from reward has been familiar to animal trainers for thousands of years. Alan Turing’s 1950 paper “Computing Machinery and Intelligence” addressed the question “Can machines think?” and proposed an approach to machine learning based on rewards and punishments.

The pinnacle of achievement has long been the Nobel Prize, but several fields of human cultural and scientific development are not included in the list of Nobel Prizes, because they are neither among the prizes established as part of Alfred Nobel's will nor, in the case of the Nobel Memorial Prize in Economic Sciences, sponsored afterwards by the Nobel Foundation. While the foundation has discouraged (and occasionally taken legal action against) individuals and organizations that have used the Nobel name to refer to prizes not meeting the aforementioned criteria, several prominent individuals and organizations have nonetheless used the label "Nobel Prize of X" to refer to highly prestigious awards in fields of activity not covered by the official Nobel Prizes.

The ACM A. M. Turing Award is an annual prize given by the Association for Computing Machinery (ACM) for contributions of lasting and major technical importance to computer science. It is generally recognized as the highest distinction in the field of computer science and is often referred to as the "Nobel Prize of Computing".  As of 2025, 79 people have been awarded the prize, with the most recent recipients being Andrew Barto and Richard S. Sutton.   

The award is named after Alan Turing, a British mathematician and reader in mathematics at the University of Manchester. Turing is often credited as being the founder of theoretical computer science and artificial intelligence,  and a key contributor to the Allied cryptanalysis of the Enigma cipher during World War II.  From 2007 to 2013, the award was accompanied by a prize of US$250,000, with financial support provided by Intel and Google. Since 2014, the award has been accompanied by a prize of US$1 million, with financial support provided by Google. 

The first recipient, in 1966, was Alan Perlis. The youngest recipient was Donald Knuth, who won in 1974 at the age of 36.  Only three women have been awarded the prize: Frances Allen (in 2006), Barbara Liskov (in 2008), and Shafi Goldwasser (in 2012). 

This year, ACM has named Andrew G. Barto and Richard S. Sutton as the recipients of the 2024 ACM A.M. Turing Award for developing the conceptual and algorithmic foundations of reinforcement learning. In a series of papers beginning in the 1980s, Barto and Sutton introduced the main ideas, constructed the mathematical foundations, and developed important algorithms for reinforcement learning, one of the most important approaches for creating intelligent systems.

Andrew Gehret Barto is an American computer scientist, currently Professor Emeritus of computer science at the University of Massachusetts Amherst. Barto is best known for his foundational contributions to the field of modern computational reinforcement learning.

Richard S. Sutton FRS FRSC is a Canadian computer scientist. He is a professor of computing science at the University of Alberta and a research scientist at Keen Technologies. Sutton is considered one of the founders of modern computational reinforcement learning, having made several significant contributions to the field, including temporal difference learning and policy gradient methods.

Although Barto and Sutton’s algorithms were developed decades ago, major advances in the practical applications of RL came about in the past fifteen years by merging RL with deep learning algorithms. This led to the technique of deep reinforcement learning. “In a 1947 lecture, Alan Turing stated ‘What we want is a machine that can learn from experience,’” noted Jeff Dean, Senior Vice President, Google. “Reinforcement learning, as pioneered by Barto and Sutton, directly answers Turing’s challenge.”

Interesting! And very tough for commoners like me to understand. I certainly look forward to your responses to this post!

 
Regards – S Sampathkumar
11.3.2025 
