Kernel trick

The kernel trick was first published in the paper

M. Aizerman, E. Braverman, and L. Rozonoer. Theoretical foundations of the potential function method in pattern recognition learning. Automation and Remote Control, 25:821-837, 1964.

The kernel trick uses Mercer's theorem, which states that any continuous, symmetric, positive semi-definite kernel K(x, y) can be expressed as a dot product in a high-dimensional space.

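More specifically, if a kernel is positive semi-definite, i.e.,

\[ \sum_{i,j} K(x_i, x_j) \, c_i c_j \ge 0 \]

for any finite set of points x_1, ..., x_n and real numbers c_1, ..., c_n, then there exists a function φ(x) whose image is in an inner product space of possibly high dimension, such that

\[ K(x, y) = \varphi(x) \cdot \varphi(y) \]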

The kernel trick transforms any algorithm that depends solely on the dot product between two vectors. Wherever a dot product is used, it is replaced with the kernel function. Thus, a linear algorithm can easily be transformed into a non-linear algorithm. This non-linear algorithm is equivalent to the linear algorithm operating in the range space of φ. However, because kernels are used, the φ function is never explicitly computed. This is desirable, because the high-dimensional space may be infinite-dimensional (as is the case when the kernel is a Gaussian).
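
As a minimal sketch of this substitution, the Python fragment below computes a squared distance in feature space using only kernel evaluations. All function names are hypothetical and chosen for illustration. It assumes 2-D inputs and the quadratic kernel K(x, y) = (x · y)^2, for which an explicit feature map happens to be known, so the kernel-only result can be checked against the explicit one. The same kernel-only routine is then applied to a Gaussian kernel, where no finite explicit map is available.

```python
import math

def dot(x, y):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def quadratic_kernel(x, y):
    """K(x, y) = (x . y)^2, evaluated entirely in the input space."""
    return dot(x, y) ** 2

def phi(x):
    """Explicit feature map for quadratic_kernel on 2-D inputs.

    Used here only to verify the kernel; the trick itself never calls it.
    """
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2)); sigma is an assumed width."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def feature_space_sq_distance(kernel, x, y):
    """||phi(x) - phi(y)||^2 expanded into dot products, each replaced by the kernel."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)

x, y = (1.0, 2.0), (3.0, -1.0)

# The kernel value agrees with the dot product of the explicit features.
explicit = dot(phi(x), phi(y))
implicit = quadratic_kernel(x, y)

# The same distance, computed with and without the explicit map.
diff = [a - b for a, b in zip(phi(x), phi(y))]
explicit_dist = dot(diff, diff)
implicit_dist = feature_space_sq_distance(quadratic_kernel, x, y)

# For the Gaussian kernel only the kernel-based route is practical.
gauss_dist = feature_space_sq_distance(gaussian_kernel, x, y)

print(explicit, implicit)
print(explicit_dist, implicit_dist)
print(gauss_dist)
```

The distance routine takes the kernel as a parameter, so switching from the quadratic to the Gaussian kernel changes the feature space without changing the algorithm.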

The kernel trick has been applied to several algorithms in machine learning and statistics.
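
As one illustration of such an application, the sketch below kernelizes the perceptron. It is a minimal example under stated assumptions, not a reference implementation: the function names, the Gaussian kernel width, and the four-point XOR data set are all chosen here for illustration. The weight vector is never formed; it is represented implicitly by per-example mistake counts, and every dot product with it becomes a sum of kernel evaluations.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2)); sigma is an assumed width."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def decision_value(x, points, labels, alpha, kernel):
    """Dot product of the implicit weight vector with phi(x).

    The weight vector is sum_i alpha[i] * labels[i] * phi(points[i]),
    so the dot product reduces to kernel evaluations against x.
    """
    return sum(a * t * kernel(p, x) for a, t, p in zip(alpha, labels, points))

def train_kernel_perceptron(points, labels, kernel, epochs=100):
    """Return per-example mistake counts after perceptron training."""
    alpha = [0] * len(points)
    for _ in range(epochs):
        mistakes = 0
        for j, (x, target) in enumerate(zip(points, labels)):
            if target * decision_value(x, points, labels, alpha, kernel) <= 0:
                alpha[j] += 1  # same update as the linear perceptron, in feature space
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified correctly
    return alpha

def predict(x, points, labels, alpha, kernel):
    """Classify x as +1 or -1 from the sign of the decision value."""
    return 1 if decision_value(x, points, labels, alpha, kernel) > 0 else -1

# XOR-style labels: not linearly separable in the input space.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [-1, 1, 1, -1]

alpha = train_kernel_perceptron(points, labels, gaussian_kernel)
predictions = [predict(p, points, labels, alpha, gaussian_kernel) for p in points]
print(alpha, predictions)
```

A linear perceptron cannot separate these four points, while the kernelized version can, because the same update rule is effectively running in the feature space induced by the kernel.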

The coiner of the term "kernel trick" is unknown.

Copyright 2004. All rights reserved.