   [1]      Altun, Y., Hofmann, T. and Johnson, M. (2002). Discriminative Learning for Label Sequences via Boosting. NIPS.

   [2]      Altun, Y., Johnson, M. and Hofmann, T. (2003). Investigating Loss Functions and Optimization Methods for Discriminative Learning of Label Sequences. ACL.

   [3]      Altun, Y., Tsochantaridis, I. and Hofmann, T. (2003). Hidden Markov Support Vector Machines. ICML.

   [4]      Altun, Y., Hofmann, T. and Smola, A. (2004). Gaussian Process Classification for Segmenting and Annotating Sequences. ICML.

   [5]      Altun, Y., Smola, A. and Hofmann, T. (2004). Exponential Families for Conditional Random Fields. UAI.

   [6]      Bartlett, P., Collins, M., McAllester, D. and Taskar, B. (2004). Exponentiated Gradient Algorithms for Large-margin Structured Classification. NIPS.

   [7]      Bockhorst, J. and Craven, M. (2004). Markov Networks for Detecting Overlapping Elements in Sequence Data. NIPS.

   [8]      Chieu, H. L. and Lee, W. S. (2005). Modeling Physiological Data with Conditional Random Fields. ICML.

   [9]      Collins, M. (2002). Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. EMNLP.

[10]      Fern, A. and Givan, R. (2004). Relational Sequential Inference with Reliable Observations. ICML.

[11]      Gärtner, T. (2003). "A Survey of Kernels for Structured Data." SIGKDD Explorations 5: 268-275.

[12]      Grandvalet, Y. and Bengio, Y. (2004). Semi-supervised Learning by Entropy Minimization. NIPS.

[13]      Kakade, S., Teh, Y. and Roweis, S. (2002). An Alternative Objective Function for Markovian Fields. ICML.

[14]      Kashima, H. and Tsuboi, Y. (2004). Kernel-based Discriminative Learning Algorithms for Labeling Sequences, Trees, and Graphs. ICML.

[15]      Kassel, R. (1995). A Comparison of Approaches to On-line Handwritten Character Recognition. Spoken Language Systems Group, MIT. PhD Thesis.

[16]      Lafferty, J., McCallum, A. and Pereira, F. (2001). Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML.

[17]      Lafferty, J., Zhu, X. and Liu, Y. (2004). Kernel Conditional Random Fields: Representation and Clique Selection. ICML.

[18]      McAllester, D., Collins, M. and Pereira, F. (2004). Case-Factor Diagrams for Structured Probabilistic Modeling. UAI.

[19]      McCallum, A. (2003). Efficiently Inducing Features of Conditional Random Fields. UAI.

[20]      Pavlov, D., Popescul, A., Pennock, D. and Ungar, L. (2003). Mixtures of Conditional Maximum Entropy Models. ICML.

[21]      Punyakanok, V. and Roth, D. (2001). The Use of Classifiers in Sequential Inference. NIPS.

[22]      Quattoni, A., Collins, M. and Darrell, T. (2004). Conditional Random Fields for Object Recognition. NIPS.

[23]      Rabiner, L. (1989). "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition." Proc. of the IEEE 77(2): 257-286.

[24]      Sarawagi, S. and Cohen, W. (2004). Semi-Markov Conditional Random Fields for Information Extraction. NIPS.

[25]      Sutton, C., Rohanimanesh, K. and McCallum, A. (2004). Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. ICML.

[26]      Taskar, B., Guestrin, C. and Koller, D. (2003). Max-Margin Markov Networks. NIPS.

[27]      Taskar, B. (2004). Learning Structured Prediction Models: A Large Margin Approach. Department of Computer Science, Stanford University. PhD Thesis.

[28]      Tsochantaridis, I., Hofmann, T., Joachims, T. and Altun, Y. (2004). Support Vector Machine Learning for Interdependent and Structured Output Spaces. ICML.

[29]      Wallach, H. (2002). Efficient Training of Conditional Random Fields. Division of Informatics, University of Edinburgh. MSc Thesis.

[30]      Weston, J., Chapelle, O., Elisseeff, A., Schölkopf, B. and Vapnik, V. (2002). Kernel Dependency Estimation. NIPS.

[31]      Zhou, D., Schölkopf, B. and Hofmann, T. (2004). Semi-supervised Learning on Directed Graphs. NIPS.

[32]      Zhu, J. and Hastie, T. (2005). "Kernel Logistic Regression and the Import Vector Machine." Journal of Computational and Graphical Statistics 14(1): 185-205.