Lifelong Learning - learn as "humans do"
a.k.a. Continuous Learning or Continual Learning
Accumulate knowledge learned in the past and
use it to learn more knowledge to build a lifelong learning machine
"Lifelong Machine Learning."
by Z. Chen and B. Liu, Morgan & Claypool Publishers, November 2016.
Lifelong Machine Learning Tutorial: Lifelong Machine Learning and Computer Reading the Web, KDD-2016, August 13-17, 2016, San Francisco, USA.
Lifelong Machine Learning Tutorial, IJCAI-2015, July 25-31, 2015, Buenos Aires, Argentina.
A Podcast: "Machines that Learn Like Humans" by my former student Zhiyuan Chen and Francesco Gadaleta (host).
Statistical learning algorithms like deep NN, SVM, HMM, CRF, and topic
modeling have been very successful in machine learning and data mining
applications. Given a dataset, such an algorithm simply runs on the
dataset to produce a model without considering any
related information or past learning results. Although these algorithms
can still be improved, such single-task and isolated algorithmic
approaches to machine learning have their limits: they need a large
number of training examples and are only suitable for well-defined and
narrow tasks. Looking ahead, the question is how to deal with
these limitations to move machine learning to the next phase. I believe
the answer is lifelong machine learning or simply
the answer is lifelong machine learning, or simply lifelong learning (a.k.a. continuous learning or continual learning), which tries to mimic "human learning" (although we do not know precisely how humans learn) to build a lifelong learning machine. The key characteristic of "human learning" is continuous learning and
adaptation to new environments - we retain or accumulate the knowledge
gained from past learning and use the knowledge to help future learning
and problem solving with possible adaptations. Existing
isolated machine learning algorithms are not capable of doing that.
However, without the lifelong learning capability, AI systems will
probably never be truly
intelligent. We believe that now is the right time to explore lifelong
learning. Big data offers a golden opportunity
because its large volume and diversity give us abundant information for discovering rich
and commonsense knowledge automatically, which can enable a lifelong
learning machine or agent to continuously learn and
accumulate knowledge, and to become more and more
knowledgeable and better and better at learning.
Human learning is very different: I believe that no human being has ever been given 1000 positive and
1000 negative documents (or images) and asked to learn a classifier from them.
As we have accumulated so much knowledge in the past and understand it,
we can usually learn with little effort and from only a few examples. If we
do not have the relevant accumulated knowledge, learning is very hard
even with 2000 training examples. For example, I don't
understand Arabic. If you give me 2000 Arabic documents and ask me to
build a classifier, I cannot do it. But that is exactly what the current
machine learning algorithms do. That is not how humans learn.
Related Learning Paradigms: Transfer learning, multitask learning, and lifelong learning
- Characteristics of lifelong learning: (1) learning continuously (ideally in the open world), (2) accumulating the previously learned knowledge to become more and more knowledgeable, and (3) using the knowledge to learn more knowledge and adapting it for problem solving (see the sketch after this list).
- Transfer learning vs. lifelong learning: Transfer learning
uses labeled data from a source domain to help learning in a target domain.
Unlike lifelong learning, transfer learning is not continuous and retains
no knowledge (it uses the source labeled data directly, not learned
knowledge). The source must be similar to the target, and both
are normally selected by the user. It is also only one-directional:
the source helps the target, but not the other way around, because the
target has little or no labeled data.
- Multitask learning vs. lifelong learning: Multitask learning
jointly optimizes the learning of multiple tasks. Although it is possible to make
it continuous, multitask learning retains no explicit knowledge
other than data, and when the number of tasks is large, it is hard to
re-learn everything whenever a new task arrives.
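To make the three characteristics above concrete, below is a minimal, runnable Python sketch. It is only an illustration, not any published algorithm; all names (KnowledgeBase, learn_task, the toy word-label tasks) are hypothetical. Each "task" is just a list of labeled words; the knowledge base retains per-word label counts across tasks and supplies them to each new learner as a prior.

    from collections import Counter

    class KnowledgeBase:
        """Retains knowledge accumulated over past tasks (here: word-label counts)."""
        def __init__(self):
            self.counts = Counter()            # (word, label) -> count across past tasks

        def update(self, labeled_words):       # (2) retain/accumulate learned knowledge
            self.counts.update(labeled_words)

        def prior(self):                       # (3) reuse the knowledge for a new task
            return dict(self.counts)

    def learn_task(labeled_words, prior):
        """Toy learner: combines current-task counts with accumulated prior counts."""
        model = Counter(prior)
        model.update(labeled_words)
        return model

    kb = KnowledgeBase()
    task_stream = [                            # (1) tasks arrive continuously
        [("good", "+"), ("bad", "-")],
        [("good", "+"), ("great", "+")],
    ]
    for task in task_stream:
        model = learn_task(task, kb.prior())   # past knowledge helps each new task
        kb.update(task)
    print(kb.counts[("good", "+")])            # 2: knowledge accumulated across tasks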
- Lifelong Unsupervised Learning:
- Lifelong topic modeling (ICML-2014, KDD-2014, WWW-2016):
retains and consolidates the topics (knowledge) learned from previous domains and uses that knowledge to guide future modeling in other domains (see the sketch after this list).
- Lifelong belief propagation (EMNLP-2016): uses previously learned
knowledge to expand the graph and to obtain more accurate results.
- Lifelong information extraction (AAAI-2016): makes use of previously learned knowledge for better extraction.
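As an illustration of how retained topics can serve as knowledge, here is a hedged sketch of the knowledge-mining step in the spirit of the lifelong topic models above: word pairs that repeatedly co-occur in topics from past domains become "must-link" knowledge for a new domain. The mining is heavily simplified (the actual models inject such knowledge into Gibbs sampling), and the function name, threshold, and toy topics are illustrative.

    from collections import Counter
    from itertools import combinations

    def mine_must_links(past_topics, min_support=2):
        """past_topics: list of topics, each a set of top words from a past domain."""
        pair_counts = Counter()
        for topic in past_topics:
            for pair in combinations(sorted(topic), 2):
                pair_counts[pair] += 1
        # Word pairs appearing in at least `min_support` past topics become must-links
        return {pair for pair, c in pair_counts.items() if c >= min_support}

    past_topics = [
        {"price", "cost", "cheap"},       # topic learned in one past domain
        {"price", "cost", "expensive"},   # topic learned in another past domain
        {"battery", "life", "charge"},
    ]
    print(mine_must_links(past_topics))   # {('cost', 'price')} -> prior for a new domain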
- Lifelong Supervised Learning (ACL-2015, ACL-2017): The ACL-2015 work is about lifelong supervised sentiment classification. The ACL-2017 work is about a lifelong learning CRF, which keeps improving its extraction with experience even after model training, i.e., it learns during model application.
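To illustrate the ACL-2015 idea, here is a much-simplified sketch of a naive Bayes sentiment classifier whose word likelihoods are smoothed with word counts retained from past tasks. The paper's penalty-based optimization is omitted; the class name, methods, and the knowledge_weight parameter are my own.

    import math
    from collections import Counter

    class LifelongNB:
        """Naive Bayes where past-task word counts act as informative pseudo-counts."""
        def __init__(self, knowledge_weight=1.0):
            self.past = {"+": Counter(), "-": Counter()}  # retained knowledge
            self.cur = {"+": Counter(), "-": Counter()}
            self.w = knowledge_weight

        def retain(self, docs, labels):
            """Accumulate knowledge (word counts) from a past task."""
            for doc, y in zip(docs, labels):
                self.past[y].update(doc.split())

        def fit(self, docs, labels):
            """Train on the current task only."""
            self.cur = {"+": Counter(), "-": Counter()}
            for doc, y in zip(docs, labels):
                self.cur[y].update(doc.split())

        def predict(self, doc):
            vocab = set()
            for y in ("+", "-"):
                vocab |= set(self.cur[y]) | set(self.past[y])
            scores = {}
            for y in ("+", "-"):
                total = sum(self.cur[y].values()) + self.w * sum(self.past[y].values())
                score = 0.0
                for word in doc.split():
                    # current-task count + weighted past knowledge + Laplace smoothing
                    c = self.cur[y][word] + self.w * self.past[y][word] + 1.0
                    score += math.log(c / (total + len(vocab)))
                scores[y] = score
            return max(scores, key=scores.get)

    nb = LifelongNB()
    nb.retain(["great battery", "poor screen"], ["+", "-"])  # knowledge from a past task
    nb.fit(["great camera"], ["+"])                          # tiny current task
    print(nb.predict("poor battery"))                        # past knowledge yields "-"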
- Open world Learning (a.k.a. open world classification or open classification) (KDD-2016, EMNLP-2017): involves:
(1) traditional classification,
(2) recognizing instances of unseen classes - not seen in training, and
(3) incrementally learning the new/unseen classes.
In the process, the system accumulates more and more knowledge.
This is open world machine learning because it does not make
the traditional closed world assumption (that all test classes have been seen
in training). The learner is self-motivated: it knows what it does and does not know.
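The following is a hedged, minimal sketch of the three components above. It is not the actual KDD-2016 or EMNLP-2017 model (those use 1-vs-rest learning with rejection); a nearest-centroid classifier stands in, rejecting examples far from every known class as unseen, and later adding a class for step (3). The reject_distance threshold and all names are illustrative.

    import math

    class OpenWorldClassifier:
        def __init__(self, reject_distance=1.0):
            self.centroids = {}                  # class label -> centroid vector
            self.reject_distance = reject_distance

        def add_class(self, label, examples):    # (3) incrementally learn a new class
            dim = len(examples[0])
            self.centroids[label] = [
                sum(x[i] for x in examples) / len(examples) for i in range(dim)
            ]

        def predict(self, x):
            best, best_d = None, float("inf")
            for label, c in self.centroids.items():
                d = math.dist(x, c)
                if d < best_d:
                    best, best_d = label, d
            if best_d > self.reject_distance:    # (2) reject: likely an unseen class
                return "unseen"
            return best                          # (1) traditional classification

    owc = OpenWorldClassifier(reject_distance=1.0)
    owc.add_class("sports", [[1.0, 0.0], [0.9, 0.1]])
    owc.add_class("politics", [[0.0, 1.0]])
    print(owc.predict([0.95, 0.05]))   # "sports"
    print(owc.predict([5.0, 5.0]))     # "unseen" -> can be labeled and added as a class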
Textbook: Zhiyuan Chen and Bing Liu. Lifelong Machine Learning. Morgan & Claypool Publishers, 2016.
- Lei Shu, Hu Xu, and Bing Liu. Lifelong Learning CRF for Supervised Aspect Extraction. To appear in Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL-2017, short paper), July 30-August 4, 2017, Vancouver, Canada.
- Lei Shu, Bing Liu, Hu Xu, and Annice Kim. Lifelong-RL: Lifelong Relaxation Labeling for Separating Entities and Aspects in Opinion Targets. Proceedings of 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP-2016), November 1–5, 2016, Austin, Texas, USA.
- Geli Fei, Shuai Wang, and Bing Liu. 2016. Learning Cumulatively to Become More Knowledgeable. Proceedings of SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-2016), August 13-17, San Francisco, USA.
- Geli Fei and Bing Liu. 2016. Breaking the Closed World Assumption in Text Classification. Proceedings of NAACL-HLT 2016, June 12-17, San Diego, USA.
- Shuai Wang, Zhiyuan Chen, and Bing Liu. Mining Aspect-Specific Opinion using a Holistic Lifelong Topic Model. Proceedings of the International World Wide Web Conference (WWW-2016), April 11-15, 2016, Montreal, Canada.
- Qian Liu, Bing Liu, Yuanlin Zhang, Doo Soon Kim and Zhiqiang Gao. Improving Opinion Aspect Extraction using Semantic Similarity and Aspect Associations. Proceedings of Thirtieth AAAI Conference on Artificial Intelligence (AAAI-2016), February 12–17, 2016, Phoenix, Arizona, USA.
- Zhiyuan Chen, Nianzu Ma and Bing Liu. Lifelong Learning for Sentiment Classification. Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics (ACL-2015, short paper), July 26-31, 2015, Beijing, China.
- Zhiyuan Chen and Bing Liu. Mining Topics in Documents: Standing on the Shoulders of Big Data. Proceedings of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD-2014), August 24-27, 2014, New York, USA. [Code] [Dataset]
- Zhiyuan Chen and Bing Liu. Topic Modeling using Topics from Many Domains, Lifelong Learning and Big Data. Proceedings of the 31st International Conference on Machine Learning (ICML-2014), June 21-26, 2014, Beijing, China.
- Zhiyuan Chen, Arjun Mukherjee, and Bing Liu. Aspect Extraction with Automated Prior Knowledge Learning. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL-2014), June 22-27, 2014, Baltimore, USA.
Created on Sep 24, 2014 by Bing Liu.