Concept class

From The Right Wiki
Jump to navigationJump to search

In computational learning theory in mathematics, a concept over a domain X is a total Boolean function over X. A concept class is a class of concepts. Concept classes are a subject of computational learning theory.

Concept class terminology frequently appears in model theory associated with probably approximately correct (PAC) learning.[1] In this setting, if one takes a set Y as a set of (classifier output) labels, and X is a set of examples, the map c:XY, i.e. from examples to classifier labels (where Y={0,1} and where c is a subset of X), c is then said to be a concept. A concept class C is then a collection of such concepts. Given a class of concepts C, a subclass D is reachable if there exists a sample s such that D contains exactly those concepts in C that are extensions to s.[2] Not every subclass is reachable.[2][why?]

Background

A sample s is a partial function from X[clarification needed] to {0,1}.[2] Identifying a concept with its characteristic function mapping X to {0,1}, it is a special case of a sample.[2] Two samples are consistent if they agree on the intersection of their domains.[2] A sample s extends another sample s if the two are consistent and the domain of s is contained in the domain of s.[2]

Examples

Suppose that C=S+(X). Then:

  • the subclass {{x}} is reachable with the sample s={(x,1)};[2][why?]
  • the subclass S+(Y) for YX are reachable with a sample that maps the elements of XY to zero;[2][why?]
  • the subclass S(X), which consists of the singleton sets, is not reachable.[2][why?]

Applications

Let C be some concept class. For any concept cC, we call this concept 1/d-good for a positive integer d if, for all xX, at least 1/d of the concepts in C agree with c on the classification of x.[2] The fingerprint dimension FD(C) of the entire concept class C is the least positive integer d such that every reachable subclass CC contains a concept that is 1/d-good for it.[2] This quantity can be used to bound the minimum number of equivalence queries[clarification needed] needed to learn a class of concepts according to the following inequality:FD(C)1#EQ(C)FD(C)ln(|C|).[2]

References

  1. Chase, H., & Freitag, J. (2018). Model Theory and Machine Learning. arXiv preprint arXiv:1801.06566.
  2. 2.00 2.01 2.02 2.03 2.04 2.05 2.06 2.07 2.08 2.09 2.10 2.11 Angluin, D. (2004). "Queries revisited" (PDF). Theoretical Computer Science. 313 (2): 188–191. doi:10.1016/j.tcs.2003.11.004.