[1] Altun, Y., Hofmann, T. and Johnson, M. (2002). Discriminative Learning for Label Sequences via Boosting. NIPS.
[2] Altun, Y., Johnson, M. and Hofmann, T. (2003). Investigating Loss Functions and Optimization Methods for Discriminative Learning of Label Sequences. ACL.
[3] Altun, Y., Tsochantaridis, I. and Hofmann, T. (2003). Hidden Markov Support Vector Machines. ICML.
[4] Altun, Y., Hofmann, T. and Smola, A. (2004). Gaussian Process Classification for Segmenting and Annotating Sequences. ICML.
[5] Altun, Y., Smola, A. and Hofmann, T. (2004). Exponential Families for Conditional Random Fields. UAI.
[6] Bartlett, P., Collins, M., McAllester, D. and Taskar, B. (2004). Exponentiated Gradient Algorithms for Large-margin Structured Classification. NIPS.
[7] Bockhorst, J. and Craven, M. (2004). Markov Networks for Detecting Overlapping Elements in Sequence Data. NIPS.
[8] Chieu, H. L. and Lee, W. S. (2005). Modeling Physiological Data with Conditional Random Fields. ICML.
[9] Collins, M. (2002). Discriminative training methods for hidden markov models: Theory and experiments with perceptron algorithms. EMNLP.
[10] Fern, A. and Givan, R. (2004). Relational sequential inference with reliable observations. ICML.
[11] Gartner, T. (2003). "A Survey of Kernels for Structured Data." SIGKDD Explorations 5: 268-275.
[12] Grandvalet, Y. and Bengio, Y. (2004). Semi-supervised Learning by Entropy Minimization. NIPS.
[13] Kakade, S., Teh, Y. and Roweis, S. (2002). An alternative objective function for Markovian Fields. ICML.
[14] Kashima, H. and Tsuboi, Y. (2004). Kernel-based Discriminative Learning Algorithms for Labeling Sequences, Trees, and Graphs. ICML.
[15] Kassel, R. (1995). A Comparison of Approaches to On-line Handwritten Character Recognition. Spoken Language Systems Group, MIT. PhD. Thesis
[16] Lafferty, J., McCallum, A. and Pereira, F. (2001). Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. ICML.
[17] Lafferty, J., X., Z. and Liu, Y. (2004). Kernel conditional random fields: representation and clique selection. ICML.
[18] McAllester, D., Collins, M. and Pereira, F. (2004). Case-Factor Diagrams for Structured Probabilistic Modeling. UAI.
[19] McCallum, A. (2003). Efficiently Inducing Features of Conditional Random Fields. UAI.
[20] Pavlov, D., Popescul, A., Pennock, D. and Ungar, L. (2003). Mixtures of Conditional Maximum Entropy Models. ICML.
[21] Punyakanok, V. and Roth, D. (2001). The Use of Classifiers in Sequential Inference. NIPS.
[22] Quattoni, A., Collins, M. and Darrell, T. (2004). Conditional Random Fields for Object Recognition. NIPS.
[23] Rabiner, L. (1989). "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition." Proc. of the IEEE 77(2): 257-286.
[24] Sarawagi, S. and Cohen, W. (2004). Semi-Markov Conditional Random Fields for Information Extraction. NIPS.
[25] Sutton, C., Rohanimanesh, K. and McCallum, A. (2004). Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data. ICML.
[26] Taskar, B., Guestrin, C. and Koller, D. (2003). Max-Margin Markov Networks. NIPS.
[27] Taskar, B. (2004). Learning Structured Prediction Models: A Large Margin Approach. Department of Computer Science, Stanford University. PhD. Thesis.
[28] Tsochantaridis, I., Hofmann, T., Joachims, T. and Altun, Y. (2004). Support Vector Machine Learning for Interdependent and Structured Output Spaces. ICML.
[29] Wallach, H. (2002). Efficiently Training of Conditional Random Fields. Division of Informatics, University of Edinburgh. MSc. Thesis
[30] Weston, J., Chapelle, O., Elisseeff, A., Schoelkopf, B. and Vapnik, V. (2002). Kernel Dependency Estimation. NIPS.
[31] Zhou, D., Scholkopf, B. and Hofmann, T. (2004). Semi-supervised Learning on Directed Graphs. NIPS.
[32] Zhu, J. and Hastie, T. (2001). "Kernel Logistic Regression and the Import Vector Machine." Journal of Computational and Graphical Statistics 14(1): 185-205.