The LZ77 compression algorithm of Ziv and Lempel is a dictionarybased algorithm which uses a history array or slidingwindow of consecutive alreadyseen symbols. However, the input symbols need not be contiguous in the history array: they can be entered in the array using a contextdriven hash function. The same hash function is thus able to "predict" the symbols using the context hash or position in the array. If the input symbol matches the symbol in the hashed window position, a boolean TRUE is transmitted. Otherwise, a boolean FALSE is output, and the mispredicted symbol is also emitted. The window position is then updated with the mispredicted symbol. The above algorithm, due to Kasman E. Thomas (1993), is employed in the PPP Predictor Compression Protocol. According to the Data Compression FAQ of the internet newsgroup comp.compression, it was first described by T. Raita and J. Teuhola (1987) in the paper "Predictive text compression by hashing", ACM Conference on Information Retrieval. The algorithm, now widelyknown as LZP (LZ Predictive), is based on the idea that "the current context is an excellent predictor of the next symbol" [Fenwick 1997] [Bloom 1996]. It has since then been developed with varying context coding functions along with different symbol ranking schemes. Bloom, for example, has many advanced versions of LZP. (It is interesting to note that the same contextbased prediction method was already anticipated by C. E. Shannon in his 1951 paper "Prediction and Entropy of Printed English," Bell System Technical Journal, Vol. 30, January 1951, pp. 5064.)
References: 1.] P. Fenwick, "Symbol Ranking Text Compression with Shannon Recodings," Journal of Universal Computer Science, Vol. 3, No. 2, Feb. 1997, pp.7085. 2.] C. Bloom, "LZP: a new data compression algorithm". Source Code: PRAQ.ZIP  includes a PPP predictor implementation which improves compression by counting the number of prediction bits (i.e., correctlypredicted symbols) and encoding the resulting sum as a variablelength integer. The input bytes are symbolranked according to recency of occurrence (MTF), and encoded using the same variablelength coding function.
