Search Quality at LinkedIn

Presented to the Bay Area Search Meetup on February 26, 2014

At LinkedIn, we face a number of challenges in delivering high-quality search results to 277M+ members. Our results are highly personalized, requiring us to build machine-learned relevance models that combine document, query, and user features. And our emphasis on entities (names, companies, job titles, etc.) affects how we process and understand queries. In this talk, we’ll discuss these challenges in detail and describe some of the solutions we are building to address them.

Slides from the presentation

Machine Learning: Stanford Professor Andrew Ng’s Video Lectures

Lectures by Professor Andrew Ng for Machine Learning (CS 229) in the Stanford Computer Science department.

This course (CS229) — taught by Professor Andrew Ng — provides a broad introduction to machine learning and statistical pattern recognition. Topics include supervised learning, unsupervised learning, learning theory, reinforcement learning and adaptive control. Recent applications of machine learning, such as to robotic control, data mining, autonomous navigation, bioinformatics, speech recognition, and text and web data processing are also discussed.

Lecture 1: Professor Ng provides an overview of the course in this introductory meeting.

Lecture 2: Professor Ng lectures on linear regression, gradient descent, and normal equations and discusses how they relate to machine learning.

Lecture 3: Professor Ng delves into locally weighted regression, probabilistic interpretation, and logistic regression, and how they relate to machine learning.

Lecture 4: Professor Ng lectures on Newton’s method, exponential families, and generalized linear models and how they relate to machine learning.

Lecture 5: Professor Ng lectures on generative learning algorithms and Gaussian discriminant analysis and their applications in machine learning.

Lecture 6: Professor Ng discusses the applications of naive Bayes, neural networks, and support vector machines.

Lecture 7: Professor Ng lectures on optimal margin classifiers, KKT conditions, and SVM duals.

Lecture 8: Professor Ng continues his lecture about support vector machines, including soft margin optimization and kernels.

Lecture 9: Professor Ng delves into learning theory, covering bias, variance, empirical risk minimization, the union bound, and Hoeffding’s inequality.

Lecture 10: Professor Ng continues his lecture on learning theory by discussing VC dimension and model selection.

Lecture 11: Professor Ng lectures on Bayesian statistics, regularization, a digression on online learning, and the applications of machine learning algorithms.

Lecture 12: Professor Ng discusses unsupervised learning in the context of clustering, Jensen’s inequality, mixture of Gaussians, and expectation-maximization.

Lecture 13: Professor Ng lectures on expectation-maximization in the context of the mixture of Gaussians and naive Bayes models, as well as factor analysis and a digression on Gaussian distributions.

Lecture 14: Professor Ng continues his discussion on factor analysis and expectation-maximization steps, and continues on to discuss principal component analysis (PCA).

Lecture 15: Professor Ng lectures on principal component analysis (PCA) and independent component analysis (ICA) in relation to unsupervised machine learning.

Lecture 16: Professor Ng discusses the topic of reinforcement learning, focusing particularly on MDPs, value functions, and policy and value iteration.

Lecture 17: Professor Ng discusses the topic of reinforcement learning, focusing particularly on continuous state MDPs, discretization, and policy and value iteration.

Lecture 18: Professor Ng discusses state action rewards, linear dynamical systems in the context of linear quadratic regulation, models, and the Riccati equation, and finite horizon MDPs.

Lecture 19: Professor Ng lectures on the debugging process, linear quadratic regulation, Kalman filters, and linear quadratic Gaussian in the context of reinforcement learning.

Lecture 20: Professor Ng discusses POMDPs, policy search, and Pegasus in the context of reinforcement learning.

MongoDB: Finding distinct items in a collection

The distinct command in MongoDB finds the distinct values for a specified field across a single collection.

The command takes the following form:

{ distinct: "<collection>", key: "<field>", query: <query> }

Examples: Return an array of the distinct values of the field name from all documents in the profile collection:

db.runCommand ( { distinct: "profile", key: "name" } )

or

db.profile.distinct('name')

To get the number of unique names:

db.profile.distinct('name').length

Return an array of the distinct values of the field name from the documents in the profile collection where the age is greater than 25:

db.runCommand ( { distinct: "profile", key: "name", query: { age: { $gt: 25 } } } )
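
To make the semantics of the commands above concrete, here is a minimal in-memory sketch in plain JavaScript. It is only an illustration, not how MongoDB implements distinct: the function name distinctValues, the predicate argument (standing in for the query document), and the sample documents are all hypothetical. It assumes field values are primitives such as strings or numbers, and it mirrors MongoDB's documented behavior of treating each element of an array-valued field as a separate value.

```javascript
// Illustration only: emulates what distinct returns, over a plain array.
// distinctValues and its predicate parameter are hypothetical names.
function distinctValues(docs, key, predicate = () => true) {
  const seen = new Set();
  for (const doc of docs) {
    // Skip documents that fail the filter or lack the field.
    if (!predicate(doc) || !(key in doc)) continue;
    // An array-valued field contributes each element as its own value.
    const values = Array.isArray(doc[key]) ? doc[key] : [doc[key]];
    for (const value of values) seen.add(value);
  }
  return [...seen];
}

// Hypothetical sample documents standing in for the profile collection.
const profiles = [
  { name: "Ann", age: 31 },
  { name: "Bob", age: 22 },
  { name: "Ann", age: 40 },
  { name: "Cy", age: 27 },
];

// Counterpart of db.profile.distinct('name')
const allNames = distinctValues(profiles, "name");

// Counterpart of the query { age: { $gt: 25 } }
const olderNames = distinctValues(profiles, "name", (p) => p.age > 25);

console.log(allNames.length, olderNames);
```

The Set keeps one copy of each value, which is why "Ann" is counted once even though two documents carry that name.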