Pattern Recognition and Machine Learning (Information Science and Statistics)

By Christopher M. Bishop

Total feedbacks: 31 (5★: 16, 4★: 3, 3★: 2, 2★: 4, 1★: 6)

Readers' Reviews

★ ☆ ☆ ☆ ☆
sarah laing
It's hard to figure out who would actually benefit from this book - it amounts to seven hundred pages of equations interrupted by blocks of text that fail to provide any intuition whatever for the techniques they are describing, and the occasional graph which is remarkable in the universe of graphs as being scarcely more informative than the equations it is meant to illustrate.

Seriously, you have to wonder wtf Bishop thought he was doing here. As a catalog of equations for people who already thoroughly understand the learning algorithms I suppose the book can be considered adequate. For any didactic purpose you're wasting your time - you can find dense, technically correct but incomprehensible descriptions for any of these methods online, for free. A textbook ought to aspire to more - should bring some order to the chaos, re-tell a technical story in a new light to make it more sensible and intuitive. This book is so bad in these regards that it makes me angry.

On a related note, I can't believe that Duda and Hart is still the best machine learning / pattern rec. book on the market after thirty years or whatever. This field is dying for a book by someone with even an INKLING of how to teach, or at least willing to make an effort to try.
★ ★ ★ ★ ★
andrea tripp
I strongly recommend Bishop's Pattern Recognition and Machine Learning as a textbook. I think it's the best introductory book on this topic. However if you need something advanced, this may not be it. Good luck!
★ ☆ ☆ ☆ ☆
kristopher
This book represents simple math so complexly, with annoying notation! It has a lot of Bayesian and Gaussian math rather than the concepts of PR. It is hard to read and understand. It is a better textbook for a Statistics dept than a CS dept.
★ ★ ☆ ☆ ☆
r j ripley
I can appreciate others who might think that this is a great book.... but I am a student using it and I have some very different opinions of it.

First, although Mr. Bishop is clearly an expert in Machine Learning, he is also obviously a HUGE fan of Bayesian Statistics. The title of the book is misleading, as it makes no mention of Bayes at all, yet EVERY CHAPTER ends with how all of the chapter's contents are combined in a Bayesian method. That's not bad; it's just not clear from the title. The title should be appended with "... using Bayesian Methods".

Second, while it is certainly a textbook, the author clearly has an understanding of the material that seems to undermine his ability to explain it. Though there are mentions of examples, there are, in fact, none. There are many graphics and tiny, trivial indicators, but I can't help but think that every single one of the concepts in the book would have benefited from even a single application. There aren't any. I am led to believe that if you are already aware of many of the methods and techniques, this would be an excellent reference or refresher. As a student starting out, I almost always have no idea what his intentions are.

To make matters worse, he occasionally uses symbols that are flat-out confusing. Why would you use pi for anything other than Pi or Product? He does. Why use little k, capital K, and the Greek letter kappa (a K!) in a series of explanations? He does. He even references articles that he has written... in 2008!!

Every chapter seems to be an exercise to see how many equations he can stuff in it. There are 300 in Chapter 2 alone. Over and over and over again I have the feeling that he is trying to TELL me how to ride a bicycle when it would have been so much easier to at least let me see the view from behind the handle bars with my feet on the pedals. Chapter five on Neural Nets, for example, is abysmally over-complicated. Would you hand someone a dictionary and ask them to write a poem? ("Hey, all the words you need are in here!") Of course not.

Third, the book mentions that there is a lot of information available on the web site. The only info available on his website is a brief overview of the text, a detailed overview of the text (that's not a typo... he has both), an example chapter, links to where the book can be purchased, and (actually, quite useful for creating slides) an archive of all of the figures in the book. There are no answers to problems or explorations of any part of the material. The upcoming book might be amazing and exactly what I am looking for, but it could be months away and another $50 or so to purchase. Hardly ideal. How about putting some of that MatLab code on your site? *Something* to crystallize the concepts!

Finally, while the intro indicates this might be a good book for Computer Scientists, it would actually make more sense to call it a Math book - more specifically, a Statistics book. There are no methods, no algorithms, no bits of pseudo-code, and (again) no applications in the text. Even examples that used hard numbers and/or elements from a real problem, worked through and explained, would be much appreciated.

Maybe I am being a little critical and perhaps I want for too much but in my mind if you are writing a book with the goal of TEACHING a subject, it would be in your interest to make things clear and illustrative. Instead, the book feels more like a combination of "I am smart. Just read this!" and a reference text.
★ ★ ★ ★ ☆
seizure romero
What I like about this book is the wonderful insight Bishop brings to the topic of machine learning. I am currently reading it for the second time. The Bayesian perspective adds a whole new dimension - for example, the coverage of the bias/variance problem is worth the price of admission on its own. I also like the way he deals with the difficulties of actually applying Bayesian techniques, given that exact Bayesian calculations are in general intractable.

As others have pointed out, this book is very math-heavy and concise. I would put it at the graduate level or slightly below. I would strongly suggest reading something else first as an introduction to ML - perhaps Mitchell's Machine Learning, or Andrew Ng's Coursera course plus his lecture notes, which you can find online. Certainly if you are looking for a cookbook, this is not for you.

Having said that, I found that the book just brims with insight and new ways of looking at familiar problems. It is one of my favorite books on ML.

One small note: if you buy the international edition you may have trouble because many of the illustrations rely strongly on color, and the copy I bought was monochrome. This makes a hard read even harder.
★ ★ ★ ☆ ☆
sonechka
I had the opportunity to purchase this book during my graduate coursework. I was new to Machine Learning, and found it very tough despite being quite good at mathematics.

The book is very good in its content, but what I have found is that it is quite heavy in terms of mathematics, and a beginner in this field will not be able to understand it without external help. We need a teacher to guide us through the equations given here, and even the problem solving needs a bit of assistance. Mr Bishop has uploaded the solutions, but even they are confusing.

The first two chapters are still fine, but the real complexity starts with the topic of Regression. The topic of Neural Networks, however, has been covered quite well, and this book is worth a read for Neural Networks.

I would recommend this book, but, not for beginners.
★ ★ ★ ★ ★
zeyad
This is a truly amazing book for someone who wishes to understand the mathematical theory behind machine learning: it is crystal clear, comprehensive, very interesting to read, with just the right amount of detail, and almost error-free. I consider it one of the best written textbooks, not just in ML but overall.

I have been working as a SW engineer in this area for many years. I have research background in theoretical computer science, but, until now, I had never studied machine learning in detail. I always wanted to understand how all the well known ML methods really work mathematically, including all gory details. I thus bought this book for self-study, and I loved it in all aspects. I was able to follow all chapters, and verify all reasoning and almost all equations. Most equations are derived from basic assumptions, and the intermediate steps are well chosen, so it was possible for me to follow the derivation in my mind or with a piece of paper. I have good background in linear algebra and probability theory, and this is sufficient to understand the book in all its glory. All needed mathematical theory is derived with sufficient level of detail from scratch in the appendices.

The book is very well written and crystal clear. The text is so well written that, in most places, no word could be added or deleted without harming it. The ordering of the chapters is very well chosen, such that knowledge gained in earlier chapters is used and reinforced in later chapters. The most important ideas are repeated when applied to a problem, but I didn't find the repetition bad. Instead, the repetition serves to enhance clarity and emphasize the important points. The equations are almost all correct, and most remaining typos are very minor and only in the intermediate derivations. The mathematical notation is almost always consistent and easy to follow.

The book should elaborate a bit more on boosting. This chapter is clearly a bit too short compared to other chapters, but otherwise the length of the remaining chapters is just right.

Reading the other reviews, I don't think there is too much Bayesian viewpoint in this book. Most of its content is applicable in both viewpoints, and the Bayesian inference is clearly separated from the previous steps. If you don't like it, you can always skip over it.
★ ★ ★ ★ ★
david brawley
This is definitely not a 'reading' book, in the sense of easy reading, or a book that you would read in bed. The right way to use this book is with a blank sheet of paper beside you to derive the equations on your own (although it doesn't say it, that seems to be the intention of the author).
If you are self-studying, and have enough knowledge of multivariate calculus and linear algebra to derive the equations on your own, you'll get a deep understanding of the content. Also, the math in the book is actually pretty easy to follow if you don't skip chapters. (I am a physics undergraduate.)

I have read some others books in this area:

Bayesian Reasoning and ML: it's good for the 'hands on' aspects of ML, but I found it lacking a base. It has a lot of examples, but it's generally hard to get the main idea. It is clear that the intention of the author is to make the topics as intuitive as possible, but if you come from a math background instead of a CS background, you will likely feel overwhelmed by calculations. In contrast, Bishop's PRML is much more concrete and its figures are much better, so you get a better intuition overall.

Mackay's Information Theory, Inference and Learning Algorithms: an excellent book, intuitive and illuminating. However, it can't replace PRML, most of the topics differ from one to the other. And if you like to find some order in your studying, you will struggle with Mackay's.

You won't find algorithms, methods or code in this book. So I won't say it's the right book for students that just want to program ML. You will find a lot of illuminating equations backed-up with excellent figures. If you want to truly understand the ideas and math of complex concepts in ML, this is the right book.
★ ★ ★ ☆ ☆
hilarymiller917
This book is a fairly thorough overview of typical topics in a graduate machine learning course. However, from page 5 on, expect to see more equations on each page than paragraphs of text (with most of the remaining text explaining the context of the variables within the equations). Now, for someone such as myself who enjoys mathematics, this is not a problem. However, I would not recommend this book for someone whose mathematics background is in any way weak. Furthermore, there is a more fundamental problem with the presentation of the material that warrants this book no more than a 3-star rating: the simple intuitiveness of the concepts is completely lost within the mathematics. Instead of merely defining what the variables represent and leaving it to the reader to figure out what is going on, this book could be made much more approachable by simply stating the intuition behind the equations. Take the sum rule, one of the first theorems in the book, as an example of how the author muddles what is effectively a basic and intuitive concept. The book gives a fairly lengthy definition of several variables representing concepts such as "the number of observations in which x_ij appears" prior to presenting a summation over all y-variables - a notational convention that the author himself admits is "cumbersome" on the next page, stating that "there will be no need for such pedantry" as that which he proceeds to perpetrate throughout the book! He could instead have simply presented the simplified sum given on the following page, p(X) = sum over Y of p(X, Y), and it would be immediately clear to most readers what he was attempting to explain.
He could also simply state the intuition behind the theorem in English: summing over every event yields a probability of one, and therefore summing a joint distribution over all values of one variable effectively marginalizes that variable out (something he comes close to doing after presenting the equation, but by then the reader's time has already been wasted). Similar examples abound throughout the book, becoming particularly bad in the middle sections, when the techniques begin to become less intuitive.
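For what it's worth, the sum rule this review is discussing fits in a few lines of code. The following is a minimal sketch with made-up numbers (not an example from the book), showing that summing a joint distribution p(X, Y) over all values of Y leaves the marginal p(X):

```python
# Joint probabilities p(X, Y) for X in {0, 1} and Y in {0, 1, 2},
# stored as joint[x][y]. The values are invented for illustration
# and chosen to sum to 1.
joint = [
    [0.10, 0.20, 0.10],  # p(X=0, Y=y) for y = 0, 1, 2
    [0.25, 0.15, 0.20],  # p(X=1, Y=y) for y = 0, 1, 2
]

# Sum rule: marginalize out Y, i.e. p(X=x) = sum over y of p(X=x, Y=y).
p_x = [sum(row) for row in joint]

print(p_x)  # approximately [0.4, 0.6]

# Sanity check: a marginal distribution still sums to 1.
print(abs(sum(p_x) - 1.0) < 1e-9)
```

The same one-liner marginalization works for any finite joint table; only the choice of which axis to sum over changes.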

As another reader mentioned, the author also commits the serious mistake of using pi for a symbol other than the constant or the product operator, which muddles the equations on a skim and forces the reader to refer back to the variable definitions to determine the context.

Having done work in machine learning's applied cousin, data mining, and thus having used many of the techniques presented in the book in actual research, I can't help but think that the presentation of the book's content could be much clearer. When doing work in the field, we can look up the equations as-needed; it is the knowledge of *when* and *how* to apply or extend these techniques that is more important, and that is the area in which I feel this book is lacking.
★ ★ ☆ ☆ ☆
abi beaudette
The book seems to be comprehensive; however, the author doesn't know how to present the material clearly. It might be OK as a reference book, but it is definitely not an ideal textbook. Here are a few things I don't like about this book:
1) Wordy. There are an awful lot of formulas, most of them straightforward. However, the author always attempts to give a few lines of explanation for every single step of a derivation. This makes the derivations unnecessarily long and unattractive. You can see formulas awkwardly embedded in a full page of words, but I would say 80% of the words deliver trivial information. Can you just skip all of the words? Perhaps not - you may not be able to follow the logic sometimes, unless you already know the material.

2) Lack of organization. I mean it has no 'definition - theorem - example - remark' kind of structure. All the stuff is mixed together. You feel like you are reading some boring government tech report, or a mediocre sci-fi novel, just with a few more formulas. As I stated before, roughly only 20% of the words are important; you just have a difficult time extracting them. Duda & Hart's book is way better than this one in this respect.

bottom line: Don't waste your time reading the book.
★ ★ ★ ★ ★
will willis
I've read many books on statistical pattern recognition and machine learning, and this is my favorite to date. This book is more focused than AIMA (Artificial Intelligence, A Modern Approach), so it serves a complementary role to this classic text.

The beginning lays a solid foundation on probability, decision theory and information theory. I was most interested in the chapters on Graphical Models, Kernel Methods, and Mixture Models & EM. The chapter on Graphical Models is available for preview on Bishop's site.

In addition to providing an insightful and coherent explanation of these techniques, he also introduces some ideas that were new to me: Relevance Vector Machines (as opposed to Support Vector Machines) and Variational Inference. His references are quite recent, and many are from pending texts and articles (It's funny to be reading the book in 2006 and see a reference from 2007.) Better still, soon he will release an accompanying library of Matlab algorithms.

This is a cutting-edge, well-written book. The writing is clear; this is the same author who wrote the widely adopted text "Neural Networks for Pattern Recognition". 5 stars...
★ ★ ☆ ☆ ☆
danisha
This book presents the personal illuminations of the author about the fields of Pattern Recognition and Machine Learning. As such it is quite interesting, but only if you already have a deep understanding of the field and want to see a new view of it. In an interview on Microsoft's Channel 9 (the author works for Microsoft Research, Cambridge, England), the author mentions that this book provides a unifying view of the field with recent breakthroughs. The unification is through the Bayesian approach. For those that don't know, Bayesian means lots of math and integrals, lots of computation, and "you should believe me, this is the right model, because it's Bayesian". From what I've seen, Bayesian methods and graphical models seem to work best for images. In practice one is faced with much more acute problems than finding the right values for the hyper-parameters. Topics like feature selection and data imbalance are not discussed.
My feeling is that this book is not appropriate for practitioners in Machine Learning and Pattern Recognition. It does not offer any statistical intuition for why methods work or how to reason about problems. It's very math-heavy, but in my opinion the more math-heavy a machine learning algorithm, the worse it is in practice. Statistical intuition and the conditions under which a method works are absent.
I like the book of Hastie, Tibshirani, and Friedman, "The Elements of Statistical Learning", which offers much more intuition (discussion), less useless math (from a practical point of view), and more experiments. Friedman has produced one of the best-working machine learning algorithms to date - gradient-boosted decision trees - by putting together four components proven to work in practice (boosting, decision trees, bagging, and linear combination of weak classifiers). Hastie, Tibshirani, and Friedman have taken lifetimes to think about what works and how models relate to each other. Graphical models, kernel methods, neural networks: that stuff is only good if you want to write papers, but not solve problems.
Again, a very nice theoretical book, but not useful for the practitioner. This book should be called "My Personal Unifying Theory of Machine Learning and Pattern Recognition Using the Bayesian Approach".
★ ★ ☆ ☆ ☆
allison means
I'm currently using this textbook for a class, and I have to say that it is the WORST textbook I have ever read. Its explanations are never clear and always cluttered with pointless notation which obscures its readability.

For instance, it will constantly explain things like "index x whose range is 1...X" for some complicated equation, and then sort of skim over what is actually going on in the rest of the equation. Just a clue: If I could understand the dense, utterly frustrating, notation-crufty equations you let pass unexplained, it would be IMMEDIATELY OBVIOUS (as it already is) that X was the upper bound on your indexing variable x. In fact, you wouldn't even need to explain that x was an indexing variable: I would be able to tell from its use in your sum notation (as I already am). Use the text to actually EXPLAIN IN ENGLISH the significance of the OBSCURE parts of your notation.

This book focuses on explaining the trivially obvious points of its equations and leaves out CLEAR and STRAIGHTFORWARD explanations of what the processes going on in its notation mean. The only reason I am giving it two stars is that it is obviously a wonderful book for someone who is a graduate-level math student, not a vanilla computer science student (even a fairly math-savvy one).
★ ☆ ☆ ☆ ☆
james oswald
This is a math-heavy approach to ML which, however, doesn't explain the insight or the necessity behind the equations - which, by the way, are not worked through analytically, but simply presented as mathematical formulas. Thus I would say this book is a mathematical encyclopedia used by people working in ML, and it should probably change its title to "Mathematical Formulas for ML" or "ML: A Bayesian Approach" or so. Finally, I would say the book's audience should be PhD students and researchers.
Apparently the title is what it is to make the book marketable.
★ ★ ★ ★ ★
joel hamill
This book brings the most up-to-date research in this field. The writing style combines common-sense intuitive explanations with precise mathematical formulations. A lot of colorful figures support the text and help the reader understand and absorb the ideas described. Short biographies of scientists like Bayes, Laplace, Gauss, etc. (which unfortunately drop off substantially after Chapter 2) provide a brief glance at the humans behind these great names. The author makes connections between the different chapters, which helps the reader see the wider picture. But don't expect easy work. Like every deep scientific text, it is sometimes fluent and fun, and sometimes demands effort - rereading the same text again and again, and consulting other references. Personally, I feel great satisfaction when, after such an effort, a concept becomes clear to me.

The other useful feature is the solved exercises which are available for download from the author's web site [..]

The main drawback of this book is the relatively small number of detailed examples. As an experienced educator, I know that "a single good example can be worth a thousand explanations". This will probably cease to be an issue with the appearance of the practical companion volume (Bishop and Nabney, 2008). The reference to a future (2008), not-yet-existing publication is unusual, fresh-thinking, and the right idea.

With this book C. Bishop continues his "tradition" of writing deep and important scientific books which was started with the "Neural Networks for Pattern Recognition".

A short comment to the reviewer "lew lwndn123", who is deeply disappointed by the fact that this is a textbook. Yes, it is a textbook, and that is clearly stated in the "Book Description". It is unfair to "kill" the book just because you didn't really check what you were going to buy, especially when you admit that "as a textbook, this is very good text, and deserves 5 stars". I think it would be a decent step if you corrected your review.
★ ★ ★ ★ ★
ifjuly
Christopher Bishop has a talent for explaining complex subjects. With a background in Data Mining, I think this book is very well written compared to some of the other top books (Elements of Statistical Learning, Pattern Classification, ...). It does get into some in-depth subjects that are beyond me, but the author does a great job of building up to them. He provides a lot of introductory material (a whole chapter on probability). After looking at quite a few papers on EM, I felt the chapter on the subject in this book was great. He is also one of the leaders in Graphical Models (which attracted me to this book), and he does a fantastic job in the GM chapter.

This book covers so much material at just the right level (mostly). Definitely recommended!
★ ★ ★ ★ ☆
ehaab
This book is quite good at explaining the basics of pattern recognition and machine learning, and enables the reader to relate the theory to diverse practical applications. The explanations are very simple. It is better to have a thorough knowledge of random vectors and linear algebra to derive maximum benefit from this book. I would recommend this book to anyone new to this field.
★ ★ ★ ★ ★
marjorie towers
As a graduate student doing research in Computer Vision, I have found Bishop's book to be an excellent reference. I purchased the book to help myself pick up some important techniques that were never covered in my formal coursework. I certainly haven't read it in its entirety yet, but have read many sections and am impressed with the explanations given. The book covers a broad spectrum of topics (just what I wanted in that regard), some complicated, and does so in a pleasantly clear and intuitive manner. I also found the brief biographies on mathematicians I've heard of over the years very interesting. Overall, an excellent reference!
★ ★ ★ ★ ★
becki hinson
Excellent book for pattern analysis and classification! It begins with basic data curve fitting and linear classification models, and ends with combining models (tree-based models, graphical models, etc.). Contains a great number of examples and exercises. A very good introduction for beginners in pattern analysis, and an excellent companion for academics and researchers.
★ ★ ★ ★ ★
claudia webb
Bishop does an excellent job of conveying an intuitive understanding of a wide and complex range of topics. Where so many authors just present theorems and proofs, this book goes to the trouble of showing graphically what is going on with the various problems and techniques described. If you are among the target audience specified in the "Book Description" (advanced undergraduate upwards) you should be able to follow the notation; and you will not be disappointed to discover that "this is a textbook" because the description clearly states that it is!
★ ★ ★ ★ ★
fion
Provides a simple introduction to probability theory, but also contains some of the best explanations available on some advanced topics like variational approximations and relevance vector machines. The whole book is easy to read, with good examples. Note that if you are interested in model selection in the variational approximation section- you should download the errata- try searching for "Pattern Recognition and Machine Learning Errata".
★ ★ ★ ★ ★
timothy york
This book is a delight to read for those interested in pattern recognition and machine learning. It presents in a clear and elegant way the fundamental ideas of these fast moving research fields. For example, the chapter on Graphical Models introduces sophisticated algorithms incrementally with a good balance of illustrations on small examples and general case discussions. This book is an excellent reference book for PR/ML researchers, PhD students and the more advanced undergraduate students.
★ ★ ★ ★ ☆
verity mclellan
Very good book on probabilistic approach to machine learning. It goes from the elementary building blocks of probability distributions, up to the higher level frameworks of Bayesian Networks and Factor Graphs. The best book I've read so far on Bayesian Networks in a continuous and hybrid space. It's quite heavy on math, especially linear algebra and matrix manipulations.
★ ★ ★ ★ ★
tim kleist
The book "Pattern Recognition and Machine Learning" assumes much less math background than other Pattern Recognition books. Christopher Bishop has a talent for explaining complex subjects. I recommend you to start this book.
★ ★ ★ ★ ★
rolynn16
This book gives a comprehensive understanding of machine learning. The way the author puts forth a myriad of topics is appreciable. The book takes more of an algorithmic standpoint than a statistical one on Machine Learning, and is highly recommended for anyone starting in this field.
★ ★ ★ ★ ★
gary winner
Great book! I recommend it to anyone who wants to learn Machine Learning. The book is very easy to read. The author starts every topic with very intuitive examples before going into more complex formulations.
★ ★ ★ ★ ★
sherri gardner
I received the book in perfect condition. This is probably the most comprehensive applied machine learning book I have seen so far. Dr. Bishop explains the topics in depth and in quite lucid language.
★ ☆ ☆ ☆ ☆
alya
I was expecting that a 700+ page book would be a scientific monograph. Disappointment: this is a textbook - an American-style textbook, with wide margins to make notes, color text, color frames, and color pictures explaining what linear regression, the Gaussian distribution, and such are.

Just to be clear: as a textbook, this is very good text, and deserves 5 stars. But I am giving just one because of disappointment. Sending it back to the store. This is not what I was looking for.
★ ☆ ☆ ☆ ☆
chantelle
This book gives a rather comprehensive and in-depth description of almost all important machine learning techniques. However, I was really disappointed to see that there are absolutely no example applications of these techniques. It's just a book full of theory and equations which leaves the reader to figure out how to actually apply these concepts to solve a real problem.
★ ★ ★ ★ ★
abibliofobi
I have been working in the field of signal processing and speech for more than 40 years at AT&T Bell Labs and, more recently, as a professor at Rutgers University and at the Univ. of California at Santa Barbara, where I teach courses in digital speech processing and speech recognition. I am extremely impressed with Chris Bishop's "Pattern Recognition and Machine Learning." The writing style is such that understanding is maximized by the clarity of thought and examples provided. He did a very nice job with the Hidden Markov Model material. He is to be congratulated on this excellent addition to the literature.
★ ☆ ☆ ☆ ☆
kiyo
I found the first chapter very interesting and the rest of the book terrible. I was hoping to use it to learn about new topics. I already have a master's in statistics, and even when reading the sections I am well versed in, the text was very difficult to follow.