Interesting Papers

These are papers/reviews/tutorials that I am reading or have enjoyed reading. The list below is a bit dated. We now keep track of interesting papers and publications via Mendeley.

Private paper repo: 
Showing 97 items
Bayesian Networks Great List of papers on Bayesian Learning Review Great List of papers on Bayesian Learning January 12, 2010 
Machine Learning Measuring and testing dependence by correlation of distances Paper A new dependence measure in [0,1] between two vectors. It takes non-linear dependencies into account and is 0 only if the two vectors are independent January 25, 2007 
Computational Biology Enhancing scatterplots with smoothed densities Paper Plotting high density scatter plots with smoothing and transparency January 16, 2003 
Computational Biology How does multiple testing correction work? Review Multiple hypothesis testing and correction in biology January 7, 2010 
Computational Biology Simcluster: clustering enumeration gene expression data on the simplex space Paper Clustering of gene expression data; uses the Aitchison distance metric, which is useful for any data that lives in simplex space (for example, probabilities or data that sum to a constant) December 27, 2009 
Biology Sequencing technologies — the next generation Review Review on next generation sequencing December 17, 2009 
Computational Biology ARTS: Accurate Recognition of Transcription Starts in Human Paper Multiple string kernels with SVMs for TSS prediction November 16, 2006 
Machine Learning Clustering with shallow trees Paper Clustering method that is intermediary between single linkage hierarchical clustering and affinity propagation November 16, 2009 
Biology ChIP–seq: advantages and challenges of a maturing technology Review Review paper on ChIP-seq and its applications September 26, 2009 
Computational Biology High-throughput chromatin information enables accurate tissue-specific prediction of transcription factor binding sites Paper Integration of chromatin mark data improves TFBS prediction September 22, 2009 
Machine Learning Deep Belief Networks Review Review by Yoshua Bengio on Deep Belief Networks September 21, 2009 
Computational Dendroscope Software Software for visualizing massive networks and trees September 19, 2009 
Machine Learning Lasso, elastic net and ridge regression code by Friedman, Tibshirani, Hastie Software MATLAB and R code (glmnet package) September 8, 2009 
Computational Biology BedTools: utilities for comparing genomic features in BED format Software BedTools: utilities for comparing genomic features in BED format September 1, 2009 
Machine Learning VOWPAL WABBIT: Sparse online learning via truncated gradient Paper Very fast online learning August 24, 2009 
Biology Long noncoding RNAs: functional surprises from the RNA world Review review on long non-coding RNAs July 30, 2009 
Boosting ASSEMBLE: Exploiting Unlabeled Data in Ensemble Methods Paper Semi supervised boosting July 18, 2002 
Machine Learning Review on semi supervised learning Review Review on semi supervised learning July 23, 2009 
Boosting Entropy regularized boosting tutorial Review Manfred Warmuth's talk on Entropy Regularized Boosting July 14, 2009 
Machine Learning Tutorial on Machine Learning reductions Review How to convert one type of learning problem into another July 14, 2009 
Machine Learning Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions Review Good review on recommendation systems, collaborative filtering and the like July 30, 2005 
Machine Learning Network-constrained regularization and variable selection for analysis of genomic data Paper Network-constrained regularized regression July 7, 2009 
Computational Biology From DNA sequence to transcriptional behaviour: a quantitative approach Review Transcription, Sequence and nucleosome positioning. Review by Eran Segal June 27, 2009 
Biology Current-generation high-throughput sequencing: deepening insights into mammalian transcriptomes Review Next gen sequencing of transcriptomes June 21, 2009 
Machine Learning Measuring classifier performance: a coherent alternative to the area under the ROC curve Paper An alternative to AUC for measuring classifier performance June 21, 2009 
Computational Biology Analytical methods for inferring functional effects of single base pair substitutions in human cancers Review Inferring functions from mutations in cancer June 16, 2009 
Machine Learning Active learning tutorial Review Active learning tutorial June 15, 2009 
Machine Learning Learning Nonlinear Dynamic Models Paper A different approach for learning HMM/DBN type models June 12, 2009 
Computational GNU Linear programming library Software GNU Linear programming library June 9, 2009 
Machine Learning The Entire Regularization Path for the Support Vector Machine Paper How to efficiently search the space of the regularization parameter C for an SVM June 9, 2009 
Computational Biology Genome-wide association analysis by lasso penalized logistic regression Paper A good methodology to try when the number of features far exceeds the number of training examples June 9, 2009 
Boosting Topics in Regularization and Boosting Review Great thesis on various types of regularization in boosting and SVMs June 9, 2009 
Machine Learning Grafting: fast, incremental feature selection by gradient descent in function space Paper The regularization term can be used to derive a stopping criterion for feature selection or for boosting March 19, 2003 
Biology Deep cap analysis gene expression (CAGE) Review Description of Deep CAGE technology for identification of TSS May 28, 2009 
Biology Fundamental concepts in genetics Review Nature Review papers on genetics May 26, 2009 
Biology Genetic Mapping in Human Disease Review Review on genome-wide association studies by David Altshuler May 27, 2008 
Computational Biology Aneuploidy prediction and tumor classification with heterogeneous hidden conditional random fields Paper L1 regularized optimization models for CNV (Rob Schapire) May 24, 2009 
Computational Biology Statistical Inference in mRNA-Seq: Exploratory Data Analysis and Differential Expression Paper mRNA-seq data normalization and differential expression May 14, 2009 
Computational Probabilistic inference using MCMC methods Review MCMC, Gibbs sampling and other sampling methods September 27, 1993 
Boosting BOOSTING ALGORITHMS: REGULARIZATION, PREDICTION AND MODEL FITTING Review A great statistical review of boosting (regression and classification) June 4, 2007 
Machine Learning VFML (Very Fast Machine Learning) toolkit Software VFML (Very Fast Machine Learning) toolkit for very fast online learning with decision trees and bayesian learning April 18, 2009 
Computational On Estimation of a Probability Density Function and Mode Paper Kernel density estimation May 28, 1962 
Machine Learning Modification of Correlation Kernels in SVM, KPCA and KCCA in Texture Classification Paper Various kernels for sequence/waveform data May 8, 2009 
Machine Learning Pattern Recognition Using Higher-Order Local Autocorrelation coefficients Paper Efficient computation of higher order cross-correlation kernels June 24, 2002 
Machine Learning Comparison of Combining Methods of Correlation Kernels in kPCA and kCCA for Texture Classification with Kansei Information Paper Various kernels for sequence waveform data May 28, 2007 
Machine Learning Signal Theory for SVM Kernel Design with applications to parameter estimation and sequence kernels Paper Kernels for sequences and waveform signals May 7, 2009 
Machine Learning Computing a nearest symmetric positive definite matrix Paper At times a kernel matrix is not symmetric positive semidefinite (PSD). This paper explains how to get the nearest PSD matrix. Useful for kernel computations. May 17, 1988 
Computational Notes on Functionals and Functional Derivatives Review Useful for understanding functional gradient descent  
Boosting mboost package documentation Software Documentation of the mboost R package by Peter Bühlmann May 2, 2009 
Teaching 10 simple rules to mix teaching with research Review  April 27, 2009 
Machine Learning Apache Mahout Software MapReduce based Machine Learning implementation April 18, 2009 
Machine Learning IBM Parallel Machine Learning Toolbox Software Parallelized k-means and SVM; NOT open source April 18, 2009 
Computational Biology Approaches to comparative sequence analysis: towards a functional view of vertebrate genomes Review Review on comparative sequence analysis April 16, 2008 
Machine Learning A kernel for time series based on global alignments Paper Kernels for time series data that is not phased (synchronized) October 2, 2006 
Machine Learning LibSVM: A Library for Support Vector Machines Software Great documentation on implementation details of various types of SVMs for classification, regression, density estimation etc. April 16, 2009 
Machine Learning Analysis of Switching Dynamics with Competing Support Vector Machines Paper Weighted SVMs for segmentation of mixed signals  
Machine Learning Cost-Sensitive Learning by Cost-Proportionate Example Weighting Paper Cost sensitive learning - includes the fabled weighted SVM April 17, 2003 
Machine Learning Map-Reduce for Machine Learning on Multicore Paper Parallelization of machine learning algorithms October 10, 2006 
Computational Biology Software package for primary analysis of Illumina next gen sequencing assays Software Highly parallelized C++ for primary data analysis of second gen sequencing assays January 24, 2009 
Computational Biology SNP imputation in association studies Review Eran Halperin's review on the use of SNPs and Haplotypes for association studies PART 2 April 13, 2009 
Computational Biology Maximizing power in association studies Review Eran Halperin's review on genome wide association studies PART 1 April 13, 2009 
Computational Convex Optimization Review Book by Stephen Boyd March 6, 2009 
Computational Biology Efficient and accurate P-value computation for Position Weight Matrices Paper Thresholds for PWMs based on a p-value cutoff December 11, 2007 
Computational CloudBurst: Highly Sensitive Short Read Mapping with MapReduce Software Massive parallelization of tag-to-genome mapping and k-mer manipulation. Based on Google's MapReduce and Hadoop March 18, 2009 
Boosting iBoost: Boosting with item set mining Software boosting itemsets January 24, 2009 
Biology E2F in vivo binding specificity: Comparison of consensus versus nonconsensus binding sites Paper Discusses TFs that bind sites that do not have consensus motifs November 13, 2008 
Machine Learning Support Vector Regression Review Tutorial on Support vector regression November 28, 2008 
Boosting Gboost: Graph boosting Software Code for boosting with graph mining January 24, 2009 
Bayesian Networks Graphical Models, Exponential Families, and Variational Inference Review Extensive review on graphical models February 25, 2009 
Biology Nucleosome positioning and gene regulation: advances through genomics Review Great review on the effect of nucleosome positioning on gene regulation February 21, 2009 
Computational Complexity of Finite Functions Review Excellent review paper on computational complexity by Boppana and Sipser August 15, 1989 
Computational Extremal Combinatorics Review Great book on Advanced topics in computational complexity February 16, 2009 
Computational Biology CoreBoost_HM Paper Boosting to predict TSS using sequence + chromatin mod data January 6, 2009 
Computational Biology CoreBoost Paper Boosting to predict TSS February 7, 2009 
Computational NoteBooks Review Great set of links to reading material for over 400 topics February 5, 2009 
Machine Learning Olivier Bousquet, Stéphane Boucheron and Gábor Lugosi, "Introduction to Statistical Learning Theory" Review Review of Statistical Learning Theory February 5, 2009 
Machine Learning Survey on active learning Review  January 24, 2009 
Computational MATLAB CVX package for convex optimization Software MATLAB CVX package for convex optimization January 24, 2009 
Computational Biology Predicting Unobserved Phenotypes for Complex Traits from Whole-Genome SNP Data Paper Predicting phenotypic traits from SNP data. Try boosting on it. November 23, 2008 
Computational Biology Activity motifs reveal principles of timing in transcriptional control of the yeast metabolic network Paper Potential project for graph boosting November 23, 2008 
Computational Biology A novel method for comparing topological models of protein structures enhanced with ligand information Paper Protein representation November 21, 2008 
Machine Learning Random Forests Paper Bootstrap based method for creating regression and classification trees November 5, 2004 
Computational Biology Bowtie: Ultra fast short read aligner Software Fast alignment of tags to genomes using indexing November 5, 2008 
Computational Reducing the Space Requirement of Suffix Trees Paper How to implement suffix trees efficiently November 26, 1999 
Computational Biology MUMmer: Ultra fast genome aligner Software Very fast sequence matching and aligning November 5, 2008 
Computational Biology SeqAn: C++ sequence library Software C++ library for sequence manipulation November 5, 2008 
Boosting Gradient Tree Boosting for Training Conditional Random Fields Paper sequence labeling method November 4, 2008 
Boosting The boosting approach to machine learning: An overview Review Introductory review on boosting November 29, 2003 
Boosting An introduction to boosting and leveraging Review Detailed review on Boosting and ensemble methods November 29, 2003 
Computational Biology Boolean implication networks derived from large scale, whole genome microarray datasets Paper Extracting Boolean implications from microarray data. Could be used as a pre-processing step before learning  
Computational Biology Extracting binary signals from microarray time-course data Paper Simple method for discretization of microarray data (mostly time course data or data that spans a large dynamic range per gene) May 1, 2007 
Machine Learning Least Angle and L1 Regression: a Review Review An interesting new method for regression October 27, 2008 
Boosting Sparse Boosting Paper A Boosting technique for regression October 11, 2006 
Boosting Improved Boosting Algorithms Using Confidence-rated Predictions Paper Excellent paper for efficient implementation of Adaboost and variants (such as abstaining) December 30, 1999 
Computational Conjugate gradient method Review Extremely lucidly explained tutorial on the Conjugate gradient method August 4, 1994 
Machine Learning A tutorial introduction to the minimum description length principle Review Review of the MDL principle June 4, 2004 
Computational Compressive Sensing Review A great site on the methods of compressive sensing (a method for compression and transfer of information) October 27, 2008 