Wikipedia - User contributions [en]

Gabriele Veneziano

2023-09-30T13:44:26Z

Magmalex: /* Life */ added link to Miguel Virasoro

{{Short description|Italian theoretical physicist}}
{{Use dmy dates|date=February 2020}}
{{Infobox scientist
| name = Gabriele Veneziano
| image = GabrieleVeneziano.jpg
| caption = (2007)
| birth_date = {{birth date and age|1942|9|7|df=y}}
| birth_place = [[Florence]], Italy
| field = [[Theoretical physics]]
| alma_mater = [[University of Florence]] [[Weizmann Institute of Science]]
| work_institution = [[CERN]]; [[Collège de France]]
| known_for = [[Veneziano amplitude]] [[Eta_and_eta_prime_mesons#General|Witten–Veneziano mechanism]] [[String theory]] [[String cosmology]]
| awards = [[Dirac Medal]] (2014) [[James_Joyce_Award#Scientists_and_academics|James Joyce Award]] (2009) [[Tomassoni awards|Tomassoni award]] (2009) {{no wrap|[[Oskar Klein Memorial Lecture]] (2007)}} [[Albert Einstein Medal]] (2006) [[Enrico Fermi Prize]] (2005) [[Dannie Heineman Prize for Mathematical Physics|Heineman Prize]] (2004) [[Pomeranchuk Prize]] (1999) [[Racah Lectures in Physics|Racah Lecture]] (1975)
}}

'''Gabriele Veneziano''' ({{IPAc-en|ˌ|v|ɛ|n|ə|t|s|i|ˈ|æ|n|oʊ}};{{citation needed|date=August 2018}} {{IPA-it|venetˈtsjaːno|lang}}; born 7 September 1942)<ref>[http://www.college-de-france.fr/site/en-gabriele-veneziano/biography.htm Biography] {{webarchive|url=https://web.archive.org/web/20120531184311/http://www.college-de-france.fr/site/en-gabriele-veneziano/biography.htm |date=31 May 2012 }} on the [[Collège de France]] website</ref> is an Italian [[theoretical physics|theoretical physicist]] widely considered the father of [[string theory]].<ref name="bio">{{Cite web |title=biography and bibliography |url=http://www.college-de-france.fr/default/EN/all/par_ele_en/index.htm |publisher=College de France |access-date=13 September 2010 |url-status=dead |archive-url=https://web.archive.org/web/20100605233717/http://www.college-de-france.fr/default/EN/all/par_ele_en/index.htm |archive-date=5 June 2010}}</ref><ref>{{cite book|author=Gasperini, F.|title=String Theory and Fundamental Interactions – Gabriele Veneziano and Theoretical Physics: Historical and Contemporary Perspectives|last2=Maharana, J.|date=2008|publisher=Springer|isbn=978-3-540-74232-6|editor1-last=Gasperini|editor1-first=Maurizio|series=Lecture Notes in Physics|volume=737|pages=3–27|chapter=Gabriele Veneziano: A Concise Scientific Biography and an Interview|doi=10.1007/978-3-540-74233-3_1|editor2-last=Maharana|editor2-first=Jnan|chapter-url=http://www.college-de-france.fr/media/par_ele_en/UPL30277_Veneziano.bio.pdf}}{{dead link|date=January 2018|bot=InternetArchiveBot|fix-attempted=yes}}</ref> He has conducted most of his scientific activities at [[CERN]] in Geneva, Switzerland, and held the Chair of Elementary Particles, Gravitation and Cosmology at the [[Collège de France]] in Paris from 2004 to 2013, until the age of retirement there.<ref name=bio />

==Life==
Gabriele Veneziano was born in Florence. In 1965, he earned his [[Laurea]] in Theoretical Physics from the [[University of Florence]] under the direction of {{Interlanguage link multi|Raoul Gatto|it}}. He pursued his doctoral studies at the [[Weizmann Institute of Science]] in [[Rehovot, Israel|Rehovot]], Israel and obtained his PhD in 1967 under the supervision of Hector Rubinstein. During his stay in Israel, he collaborated, among others, with Marco Ademollo (a professor in Florence) and [[Miguel Virasoro]] (an Argentinian physicist who later became a professor in Italy). During his years at MIT, he collaborated with many colleagues, primarily with [[Sergio Fubini]] (an MIT professor, later a member of the Theory Division and of the Directorate at [[CERN]] in Geneva, Switzerland).

Between 1968 and 1972 he worked at [[MIT]] and was a summer visitor of the Theory Division at [[CERN]]. In 1972 he accepted the Amos de Shalit Professor of Physics chair at the Weizmann Institute of Science in Rehovot, Israel. In 1976-1978 he accepted a permanent position in the Theory Division at [[CERN]] in Geneva, Switzerland, a position that he held until the age of retirement in 2007 and where he has been since then Honorary member. Between 1994 and 1997, he was Director of the Theory Division. He also held the chair of Elementary Particles, Gravitation and Cosmology at the [[College of France]] in Paris, France (2004-2013), of which he is currently an Honorary Professor. He visited many Universities all over the world. More recently he was Global Distinguished Professor at [[New York University]] and is Sackler Professor at Tel-Aviv University.

==Research==
Gabriele Veneziano first formulated the foundations of string theory in 1968 when he discovered a string picture that could describe the interaction of strongly interacting particles.<ref>{{cite journal |last=Veneziano |first=G. |date=1968 |title=Construction of a crossing-symmetric, Regge-behaved amplitude for linearly rising trajectories |journal=Nuovo Cimento A |volume=57 |issue=1 |pages=190–7|doi=10.1007/BF02824451|bibcode = 1968NCimA..57..190V |s2cid=121211496 |url=https://cds.cern.ch/record/390478 }}</ref><ref>{{cite journal |author1=Lovelace, C. |author2=Squires, E. |date=1970 |title=Veneziano Theory |journal=Proc. R. Soc. Lond. A |volume=318 | number=1534 |pages=321–353 | doi=10.1098/rspa.1970.0148 |bibcode = 1970RSPSA.318..321L |s2cid=124404183 |url=https://cds.cern.ch/record/350802 }}</ref><ref>{{cite book|last=Di Vecchia|first=P.|title=String Theory and Fundamental Interactions – Gabriele Veneziano and Theoretical Physics: Historical and Contemporary Perspectives|date=2008|publisher=Springer|isbn=978-3-540-74232-6|editor1-last=Gasperini|editor1-first=Maurizio|series=Lecture Notes in Physics|volume=737|pages=59–118|chapter=The Birth of String Theory|lccn=2007934340|ol=OL16156324M|editor2-last=Maharana|editor2-first=Jnan|chapter-url=http://www-hep.physics.uiowa.edu/~vincent/courses/29276/Vecchia.pdf|archive-url=https://web.archive.org/web/20110902183813/http://www-hep.physics.uiowa.edu/~vincent/courses/29276/Vecchia.pdf|archive-date=2 September 2011|url-status=dead}}</ref> Veneziano discovered that the [[Leonhard Euler|Euler]] [[Beta function]], interpreted as a [[scattering amplitude]], has many of the features needed to explain the physical properties of strongly interacting particles. This amplitude, known as the [[Veneziano amplitude]], is interpreted as the scattering amplitude for four [[String (physics)#Types of strings|open string]] [[tachyon]]s. In retrospect this work is now considered the founding of [[string theory]] although at the time it was not apparent the string picture would lead to a new theory of quantum gravity.

Veneziano's work led to intense research to try to explain the [[strong force]] by a [[field theory (physics)|field theory]] of strings about one [[fermi (unit)|fermi]] in length. The rise of [[quantum chromodynamics]], a rival explanation of the strong force, led to a temporary loss of interest in string theories until the 1980s when interest was revived.

In 1991, he published a paper<ref>{{cite journal|last=Veneziano|first=G.|title=Scale factor duality for classical and quantum strings|journal=Physics Letters B|volume=265|issue=3–4|pages=287–294|date=1991|doi=10.1016/0370-2693(91)90055-U|bibcode = 1991PhLB..265..287V |url=https://cds.cern.ch/record/220581}}</ref> that shows how an inflationary cosmological model can be obtained from string theory, thus opening the door to a description of [[String cosmology|string cosmological]] [[pre-big bang]] scenarios.

==Society memberships==
*National Academy of Sciences of Turin (1994)
*Lincei National Academy (1996)
*[[French Academy of Sciences]] (2002)

==Awards==
* [[Pomeranchuk Prize]], 1999
* Gold medal della Repubblica Italiana come Benemerito della Cultura, 2000
* [[Dannie Heineman Prize for Mathematical Physics]], from the [[American Physical Society]], 2004
*Enrico Fermi Prize from the Italian Physical Society, 2005
*[[Albert Einstein Medal]], Albert Einstein Institute, Bern, Switzerland, 2006
*[[Oskar Klein Medal]], 2007
*Commendatore al merito della Repubblica Italiana, 2007
*[[James Joyce Award]], University College Dublin, 2009
*Felice Pietro Chisesi and Caterina Tomassoni Prize, 2009
*Dirac Medal by [[ICTP]], 2014
*Honorary doctorate, Swansea University, 2015<ref>{{Cite web|url=https://www.swansea.ac.uk/graduation/honoraryawards/honoraryawardsarchive/honoraryawards2015/professorgabrieleveneziano/|title=Professor Gabriele Veneziano|website=www.swansea.ac.uk|access-date=2020-02-29}}</ref>
*Friedel-Volterra Prize, by SIF and SFP, 2016–2017

==References==
{{Reflist}}

==External links==
*[https://inspirehep.net/author/profile/G.Veneziano.1 Scientific publications of Gabriele Veneziano] on [[INSPIRE-HEP]]

{{Authority control}}

{{DEFAULTSORT:Veneziano, Gabriele}}
[[Category:1942 births]]
[[Category:Academic staff of the Collège de France]]
[[Category:People associated with CERN]]
[[Category:Living people]]
[[Category:Members of the French Academy of Sciences]]
[[Category:Italian string theorists]]
[[Category:Albert Einstein Medal recipients]]

Center (group theory)

2023-06-08T12:41:41Z

Magmalex: /* As a subgroup */ added abelian property

{{Short description|Set of elements that commute with every element of a group}}
{{Use American English|date=March 2021}}
{{Use mdy dates|date=March 2021}}
{{redirect|Group center|the American counter-cultural group|Aldo Tambellini#Lower East Side artists}}
{| class="wikitable floatright"

|+ style="text-align: left;" | [[Cayley table]] for [[Dihedral group of order 8|D4]] showing elements of the center, {e, a2}, commute with all other elements (this can be seen by noticing that all occurrences of a given center element are arranged symmetrically about the center diagonal or by noticing that the row and column starting with a given center element are [[Transpose|transposes]] of each other).

|-
! ∘ || e|| b|| a|| a2|| a3|| ab|| a2b|| a3b
|- align="center"
! e
| style="background: green; color: white;" | '''e'''|| b|| a|| style="background: red; color: white;" | a2|| a3|| ab|| a2b|| a3b
|- align="center"
! b
| b|| style="background: green; color: white;" | '''e'''|| a3b|| a2b|| ab|| a3|| style="background: red; color: white;" | a2|| a
|- align="center"
! a
| a|| ab|| style="background: red; color: white;" | a2|| a3|| style="background: green; color: white;" | '''e'''|| a2b|| a3b|| b
|- align="center"
! a2
| style="background: red; color: white;" | a2|| a2b|| a3|| style="background: green; color: white;" | '''e'''|| a|| a3b|| b|| ab
|- align="center"
! a3
| a3
|| a3b|| style="background: green; color: white;" | '''e'''|| a|| style="background: red; color: white;" | a2|| b|| ab|| a2b
|- align="center"
! ab
| ab|| a|| b|| a3b|| a2b|| style="background: green; color: white;" | '''e'''|| a3|| style="background: red; color: white;" | a2
|- align="center"
! a2b
| a2b|| style="background: red; color: white;" | a2|| ab|| b|| a3b|| a|| style="background: green; color: white;" | '''e'''|| a3
|- align="center"
! a3b
| a3b|| a3|| a2b|| ab|| b|| style="background: red; color: white;" | a2|| a|| style="background: green; color: white;" | '''e'''
|}
In [[abstract algebra]], the '''center''' of a [[group (mathematics)|group]], {{math|''G''}}, is the [[set (mathematics)|set]] of elements that [[commutative|commute]] with every element of {{math|''G''}}. It is denoted {{math|Z(''G'')}}, from German ''[[wikt:Zentrum|Zentrum]],'' meaning ''center''. In [[set-builder notation]],

:{{math|1=Z(''G'') = {{mset|''z'' ∈ ''G'' | ∀''g'' ∈ ''G'', ''zg'' {{=}} ''gz''}}}}.

The center is a [[normal subgroup]], {{math|Z(''G'') ⊲ ''G''}}. As a subgroup, it is always [[characteristic subgroup|characteristic]], but is not necessarily [[fully characteristic subgroup|fully characteristic]]. The [[quotient group]], {{math|''G'' / Z(''G'')}}, is [[group isomorphism|isomorphic]] to the [[inner automorphism]] group, {{math|Inn(''G'')}}.

A group {{math|''G''}} is abelian if and only if {{math|1=Z(''G'') = ''G''}}. At the other extreme, a group is said to be '''centerless''' if {{math|Z(''G'')}} is [[trivial group|trivial]]; i.e., consists only of the [[identity element]].

The elements of the center are sometimes called '''central'''.

==As a subgroup==
The center of ''G'' is always a [[subgroup (mathematics)|subgroup]] of {{math|''G''}}. In particular:
# {{math|Z(''G'')}} contains the [[identity element]] of {{math|''G''}}, because it commutes with every element of {{math|''g''}}, by definition: {{math|1=''eg'' = ''g'' = ''ge''}}, where {{math|''e''}} is the identity;
# If {{math|''x''}} and {{math|''y''}} are in {{math|Z(''G'')}}, then so is {{math|''xy''}}, by associativity: {{math|1=(''xy'')''g'' = ''x''(''yg'') = ''x''(''gy'') = (''xg'')''y'' = (''gx'')''y'' = ''g''(''xy'')}} for each {{math|''g'' ∈ ''G''}}; i.e., {{math|Z(''G'')}} is closed;
# If {{math|''x''}} is in {{math|Z(''G'')}}, then so is {{math|''x''{{sup|−1}}}} as, for all {{math|''g''}} in {{math|''G''}}, {{math|''x''{{sup|−1}}}} commutes with {{math|''g''}}: {{math|1=(''gx'' = ''xg'') ⇒ (''x''{{sup|−1}}''gxx''{{sup|−1}} = ''x''{{sup|−1}}''xgx''{{sup|−1}}) ⇒ (''x''{{sup|−1}}''g'' = ''gx''{{sup|−1}})}}.

Furthermore, the center of {{math|''G''}} is always an [[abelian group|abelian]] and [[normal subgroup]] of {{math|''G''}}. Since all elements of {{math|Z(''G'')}} commute, it is closed under [[conjugate closure|conjugation]].

Note that a homomorphism {{math|''f'': ''G'' → ''H''}} between groups generally does not restrict to a homomorphism between their centers. Although {{math|''f'' (''Z'' (''G''))}} commutes with {{math|''f'' ( ''G'' )}}, unless {{math|''f''}} is surjective {{math|''f'' (''Z'' (''G''))}} need not commute with all of {{math|''H''}} and therefore need not be a subset of {{math|''Z'' ( ''H'' )}}. Put another way, there is no "center" functor between categories Grp and Ab. Even though we can map objects, we cannot map arrows.

==Conjugacy classes and centralizers==
By definition, the center is the set of elements for which the [[conjugacy class]] of each element is the element itself; i.e., {{math|1=Cl(''g'') = {''g''}<nowiki/>}}.

The center is also the [[intersection (set theory)|intersection]] of all the [[centralizer and normalizer|centralizers]] of each element of {{math|''G''}}. As centralizers are subgroups, this again shows that the center is a subgroup.

==Conjugation==
Consider the map, {{math|''f'': ''G'' → Aut(''G'')}}, from {{math|''G''}} to the [[automorphism group]] of {{math|''G''}} defined by {{math|1=''f''(''g'') = ''ϕ''{{sub|''g''}}}}, where {{math|''ϕ''{{sub|''g''}}}} is the automorphism of {{math|''G''}} defined by
:{{math|1=''f''(''g'')(''h'') = ''ϕ''{{sub|''g''}}(''h'') = ''ghg''{{sup|−1}}}}.

The function, {{math|''f''}} is a [[group homomorphism]], and its [[kernel (algebra)|kernel]] is precisely the center of {{math|''G''}}, and its image is called the [[inner automorphism group]] of {{math|''G''}}, denoted {{math|Inn(''G'')}}. By the [[first isomorphism theorem]] we get,
:{{math|''G''/Z(''G'') ≃ Inn(''G'')}}.

The [[cokernel]] of this map is the group {{math|Out(''G'')}} of [[outer automorphism]]s, and these form the [[exact sequence]]
:{{math|1 ⟶ Z(''G'') ⟶ ''G'' ⟶ Aut(''G'') ⟶ Out(''G'') ⟶ 1}}.

==Examples==

* The center of an [[abelian group]], {{math|''G''}}, is all of {{math|''G''}}.
* The center of the [[Heisenberg group]], {{math|''H''}}, is the set of matrices of the form: <math display="block"> \begin{pmatrix}
1 & 0 & z\\
0 & 1 & 0\\
0 & 0 & 1
\end{pmatrix}</math>
* The center of a [[nonabelian group|nonabelian]] [[simple group]] is trivial.
* The center of the [[dihedral group]], {{math|D{{sub|''n''}}}}, is trivial for odd {{math|''n'' ≥ 3}}. For even {{math|''n'' ≥ 4}}, the center consists of the identity element together with the 180° rotation of the [[polygon]].
* The center of the [[quaternion group]], {{math|1=Q{{sub|8}} = {1, −1, i, −i, j, −j, k, −k} }}, is {{math|{1, −1}<nowiki/>}}.
* The center of the [[symmetric group]], {{math|''S''{{sub|''n''}}}}, is trivial for {{math|''n'' ≥ 3}}.
* The center of the [[alternating group]], {{math|''A''{{sub|''n''}}}}, is trivial for {{math|''n'' ≥ 4}}.
* The center of the [[general linear group]] over a [[Field (mathematics)|field]] {{math|F}}, {{math|GL{{sub|''n''}}(F)}}, is the collection of [[diagonal matrix|scalar matrices]], {{math|{{mset| sI''n'' ∣ s ∈ F \ {0} }}}}.
* The center of the [[orthogonal group]], {{math|O''n''(F)}} is {{math|{I''n'', −I''n''}<nowiki/>}}.
* The center of the [[special orthogonal group]], {{math|SO(''n'')}} is the whole group when {{math|1=''n'' = 2}}, and otherwise {{math|{{mset|I''n'', −I''n''}}}} when ''n'' is even, and trivial when ''n'' is odd.
* The center of the [[unitary group]], <math>U(n)</math> is <math>\left\{ e^{i\theta} \cdot I_n \mid \theta \in [0, 2\pi) \right\}</math>.
* The center of the [[special unitary group]], <math>\operatorname{SU}(n)</math> is <math display="inline">\left\lbrace e^{i\theta} \cdot I_n \mid \theta = \frac{2k\pi}{n}, k = 0, 1, \dots, n-1 \right\rbrace </math>.
* The center of the multiplicative group of non-zero [[quaternion]]s is the multiplicative group of non-zero [[real number]]s.
* Using the [[class equation]], one can prove that the center of any non-trivial [[finite group|finite]] [[p-group]] is non-trivial.
* If the [[quotient group]] {{math|''G''/Z(''G'')}} is [[cyclic group|cyclic]], {{math|''G''}} is [[abelian group|abelian]] (and hence {{math|1=''G'' = Z(''G'')}}, so {{math|''G''/Z(''G'')}} is trivial).
* The center of the [[megaminx]] group is a cyclic group of order 2, and the center of the [[kilominx]] group is trivial.

==Higher centers==
Quotienting out by the center of a group yields a sequence of groups called the '''[[upper central series]]''':

:{{math|1=(''G''{{sub|0}} = ''G'') ⟶ (''G''{{sub|1}} = ''G''{{sub|0}}/Z(''G''{{sub|0}})) ⟶ (''G''{{sub|2}} = ''G''{{sub|1}}/Z(''G''{{sub|1}})) ⟶ ⋯}}

The kernel of the map {{math|''G'' → ''G{{sub|i}}''}} is the '''{{math|''i''}}th center'''<ref>{{Cite journal |last=Ellis |first=Graham |date=1998-02-01 |title=On groups with a finite nilpotent upper central quotient |url=https://doi.org/10.1007/s000130050169 |journal=Archiv der Mathematik |language=en |volume=70 |issue=2 |pages=89–96 |doi=10.1007/s000130050169 |issn=1420-8938}}</ref> of {{math|''G''}} ('''second center''', '''third center''', etc.) and is denoted {{math|Z{{sup|''i''}}(''G'')}}.<ref>{{Cite journal |last=Ellis |first=Graham |date=1998-02-01 |title=On groups with a finite nilpotent upper central quotient |url=https://doi.org/10.1007/s000130050169 |journal=Archiv der Mathematik |language=en |volume=70 |issue=2 |pages=89–96 |doi=10.1007/s000130050169 |issn=1420-8938}}</ref> Concretely, the ({{math|''i'' + 1}})-st center are the terms that commute with all elements up to an element of the {{math|''i''}}th center. Following this definition, one can define the 0th center of a group to be the identity subgroup. This can be continued to [[transfinite ordinals]] by [[transfinite induction]]; the union of all the higher centers is called the '''[[hypercenter]]'''.<ref group="note">This union will include transfinite terms if the UCS does not stabilize at a finite stage.</ref>

The [[total order#Chains|ascending chain]] of subgroups
:{{math|1 ≤ Z(''G'') ≤ Z{{sup|2}}(''G'') ≤ ⋯}}
stabilizes at ''i'' (equivalently, {{math|1=Z{{sup|''i''}}(''G'') = Z{{sup|i+1}}(''G'')}}) [[if and only if]] {{math|''G''{{sub|''i''}}}} is centerless.

===Examples===
* For a centerless group, all higher centers are zero, which is the case {{math|1=Z{{sup|0}}(''G'') = Z{{sup|1}}(''G'')}} of stabilization.
* By [[Grün's lemma]], the quotient of a [[perfect group]] by its center is centerless, hence all higher centers equal the center. This is a case of stabilization at {{math|1=Z{{sup|1}}(''G'') = Z{{sup|2}}(''G'')}}.

==See also==
*[[Center (algebra)]]
*[[Center (ring theory)]]
*[[Centralizer and normalizer]]
*[[Conjugacy class]]

==Notes==
{{reflist|group=note}}

== References ==
* {{cite book
| last1=Fraleigh | first1=John B. | authorlink1=
| year = 2014
| title = A First Course in Abstract Algebra
| edition = 7
| publisher = Pearson
| isbn = 978-1-292-02496-7
}}

==External links==
* {{springer|title=Centre of a group|id=p/c021250}}

[[Category:Group theory]]
[[Category:Functional subgroups]]

Commesse

2023-05-08T01:22:00Z

Magmalex: corrected the English translation of the title

{{Infobox television
| image = Commesse poster.jpg
| caption =
| camera =
| runtime =
| creator =
| starring =
| country = [[Italy]]
| network =
| first_aired =
| last_aired =
| num_seasons =
| num_episodes =
| list_episodes =
}}

'''''Commesse (sales associates)''''' is an Italian television comedy drama series directed by [[Giorgio Capitani]] and broadcast by [[Rai 1]] between 1999 and 2002.

== Cast ==
* [[Sabrina Ferilli]]: Marta De Santis
* [[Nancy Brilli]]: Roberta Ardenzi
* [[Veronica Pivetti]]: Fiorenza
* [[Franco Castellano]]: Romeo
* [[Caterina Vertova]]: Francesca Carraro
* [[Anna Valle]]: Paola
* [[Elodie Treccani]]: Lucia Manca
* [[Lorenzo Ciompi]]: Dottor Livata
* [[Giacomo Piperno]]: Dante, padre di Fiorenza
* [[Giuliana Calandra]]: Anna, madre di Fiorenza
* [[Rodolfo Bigotti]]: Giancarlo De Santis, marito di Marta
* [[Massimo Ciavarro]]: Architetto Riccardo Jesi
* [[Gigliola Cinquetti]]: Clara Massimi
* [[Ray Lovelock (actor)|Ray Lovelock]]: Luca Massimi
* [[Caterina Deregibus]]: Elisa (season 2)
* [[Marco Bonini]]: Tommaso (season 2)
* [[Massimo Ghini]]: Avvocato Giovanni Minardi (season 2)
* [[Cesare Bocci]]: Gianni, fidanzato di Fiorenza (season 2)

==See also==
*[[List of Italian television series]]

==External links==
* {{IMDb title}}

{{Rai original series}}
{{Authority Control}}

[[Category:Italian television series]]
[[Category:RAI original programming]]


{{Italy-tv-prog-stub}}

Real projective plane

2022-04-08T12:09:16Z

Magmalex: Clarified construction of Moebious strip

{{short description|Compact non-orientable two-dimensional manifold}}
{| class=wikitable align=right
| valign=top width=120|[[File:ProjectivePlaneAsSquare.svg|120px]] The [[fundamental polygon]] of the projective plane.
| valign=top width=120 style="padding-top:8px"|[[File:MöbiusStripAsSquare.svg|102px|center]] <div style="margin-top:6px;">The [[Möbius strip]] with a single edge, can be closed into a projective plane by gluing opposite open edges together.</div>
| valign=top width=120|[[File:KleinBottleAsSquare.svg|120px]] In comparison, the [[Klein bottle]] is a Möbius strip closed into a cylinder.
|}
In [[mathematics]], the '''real [[projective plane]]''' is an example of a compact non-[[Orientability|orientable]] two-dimensional [[manifold]]; in other words, a one-sided [[Surface (topology)|surface]]. It cannot be [[embedding|embedded]] in standard three-dimensional space without intersecting itself. It has basic applications to [[geometry]], since the common construction of the real projective plane is as the space of lines in '''R'''3 passing through the origin.

The plane is also often described topologically, in terms of a construction based on the [[Möbius strip]]: if one could glue the (single) edge of the Möbius strip to itself in the correct direction, one would obtain the [[projective plane]]. (This cannot be done in three-dimensional space without the surface intersecting itself.) Equivalently, gluing a disk along the boundary of the Möbius strip gives the projective plane. Topologically, it has [[Euler characteristic]] 1, hence a [[Genus (mathematics)|demigenus]] (non-orientable genus, Euler genus) of 1.

Since the Möbius strip, in turn, can be constructed from a [[Square (geometry)|square]] by gluing two of its sides together with a half-twist, the ''real'' projective plane can thus be represented as a unit square (that is, [0, 1] [[Cartesian product|×]] [0,1] ) with its sides identified by the following [[equivalence relation]]s:
:(0, ''y'') ~ (1, 1 − ''y'')   for 0 ≤ ''y'' ≤ 1

and
:(''x'', 0) ~ (1 − ''x'', 1)   for 0 ≤ ''x'' ≤ 1,

as in the leftmost diagram shown here.

== Examples ==
Projective geometry is not necessarily concerned with curvature and the real projective plane may be twisted up and placed in the Euclidean plane or 3-space in many different ways.<ref name="apery">Apéry, F.; ''Models of the real projective plane'', Vieweg (1987)</ref> Some of the more important examples are described below.

The projective plane cannot be [[embedding|embedded]] (that is without intersection) in three-dimensional [[Euclidean space]]. The proof that the projective plane does not embed in three-dimensional Euclidean space goes like this: Assuming that it does embed, it would bound a compact region in three-dimensional Euclidean space by the [[Jordan curve theorem|generalized Jordan curve theorem]]. The outward-pointing unit normal vector field would then give an [[Orientation (mathematics)|orientation]] of the boundary manifold, but the boundary manifold would be the [[projective plane]], which is not orientable. This is a contradiction, and so our assumption that it does embed must have been false.

===The projective sphere===
Consider a [[sphere]], and let the [[great circle]]s of the sphere be "lines", and let pairs of [[antipodal point]]s be "points". It is easy to check that this system obeys the axioms required of a [[projective plane]]:

*any pair of distinct great circles meet at a pair of antipodal points; and
*any two distinct pairs of antipodal points lie on a single great circle.

If we identify each point on the sphere with its antipodal point, then we get a representation of the real projective plane in which the "points" of the projective plane really are points. This means that the projective plane is the quotient space of the sphere obtained by partitioning the sphere into equivalence classes under the equivalence relation ~, where x ~ y [[Conditional clause|if]] y = x or y = −x. This quotient space of the sphere is [[homeomorphic]] with the collection of all lines passing through the origin in '''R'''3.

The quotient map from the sphere onto the real projective plane is in fact a two sheeted (i.e. two-to-one) [[covering map]]. It follows that the [[fundamental group]] of the real projective plane is the cyclic group of order 2; i.e., integers modulo 2. One can take the loop ''AB'' from the figure above to be the generator.

===The projective hemisphere===
Because the sphere covers the real projective plane twice, the plane may be represented as a closed hemisphere around whose rim opposite points are similarly identified.<ref>Weeks, J.; ''The shape of space'', CRC (2002), p 59</ref>

===Boy's surface – an immersion===
The projective plane can be [[Immersion (mathematics)|immersed]] (local neighbourhoods of the source space do not have self-intersections) in 3-space. [[Boy's surface]] is an example of an immersion.

Polyhedral examples must have at least nine faces.<ref>Brehm, U.; "How to build minimal polyhedral models of the Boy surface", ''The mathematical intelligencer'' '''12''', No. 4 (1990), pp 51-56.</ref>

===Roman surface===
[[Image:Steiner's Roman Surface.gif|thumb|An animation of the Roman Surface]]
Steiner's [[Roman surface]] is a more degenerate map of the projective plane into 3-space, containing a [[cross-cap]].

[[File:Tetrahemihexahedron.png|thumb|The [[tetrahemihexahedron]] is a polyhedral representation of the real projective plane.]]
A [[Polyhedron|polyhedral]] representation is the [[tetrahemihexahedron]],<ref name="richter">{{Harv|Richter}}</ref> which has the same general form as Steiner's Roman Surface, shown here.

===Hemi polyhedra===
Looking in the opposite direction, certain [[abstract regular polytope]]s – [[Hemicube (geometry)|hemi-cube]], [[hemi-dodecahedron]], and [[hemi-icosahedron]] – can be constructed as regular figures in the ''projective plane;'' see also [[projective polyhedra]].

===Planar projections===
Various planar (flat) projections or mappings of the projective plane have been described. In 1874 Klein described the mapping:<ref name="apery" />
: <math>k (x, y) = \left(1 + x^2 + y^2\right)^\frac{1}{2} \begin{pmatrix}x \\ y\end{pmatrix}</math>

Central projection of the projective hemisphere onto a plane yields the usual infinite projective plane, described below.

===Cross-capped disk===
A closed surface is obtained by gluing a [[disk (mathematics)|disk]] to a [[cross-cap]]. This surface can be represented parametrically by the following equations:
:<math>\begin{align}
X(u,v) &= r \, (1 + \cos v) \, \cos u, \\
Y(u,v) &= r \, (1 + \cos v) \, \sin u, \\
Z(u,v) &= -\operatorname{tanh}\left(u - \pi \right) \, r \, \sin v,
\end{align}</math>

where both ''u'' and ''v'' range from 0 to 2''π''.

These equations are similar to those of a [[torus]]. Figure 1 shows a closed cross-capped disk.

{|
| [[Image:CrossCapTwoViews.PNG|500px]]
|-
| align=center | Figure 1. Two views of a cross-capped disk.
|}

A cross-capped disk has a [[plane of symmetry]] which passes through its line segment of double points. In Figure 1 the cross-capped disk is seen from above its plane of symmetry ''z'' = 0, but it would look the same if seen from below.

A cross-capped disk can be sliced open along its plane of symmetry, while making sure not to cut along any of its double points. The result is shown in Figure 2.

{|
| [[Image:CrossCapSlicedOpen.PNG|500px]]
|-
| align=center | Figure 2. Two views of a cross-capped disk which has been sliced open.
|}

Once this exception is made, it will be seen that the sliced cross-capped disk is [[homeomorphism|homeomorphic]] to a self-intersecting disk, as shown in Figure 3.

{|
| [[Image:SelfIntersectingDisk.PNG|500px]]
|-
| align=center | Figure 3. Two alternative views of a self-intersecting disk.
|}

The self-intersecting disk is homeomorphic to an ordinary disk. The parametric equations of the self-intersecting disk are:
:<math>\begin{align}
X(u, v) &= r \, v \, \cos 2u, \\
Y(u, v) &= r \, v \, \sin 2u, \\
Z(u, v) &= r \, v \, \cos u,
\end{align}</math>

where ''u'' ranges from 0 to 2''π'' and ''v'' ranges from 0 to 1.

Projecting the self-intersecting disk onto the plane of symmetry (''z'' = 0 in the parametrization given earlier) which passes only through the double points, the result is an ordinary disk which repeats itself (doubles up on itself).

The plane ''z'' = 0 cuts the self-intersecting disk into a pair of disks which are mirror [[Reflection (mathematics)|reflection]]s of each other. The disks have centers at the [[Origin (mathematics)|origin]].

Now consider the rims of the disks (with ''v'' = 1). The points on the rim of the self-intersecting disk come in pairs which are reflections of each other with respect to the plane ''z'' = 0.

A cross-capped disk is formed by identifying these pairs of points, making them equivalent to each other. This means that a point with parameters (''u'', 1) and coordinates <math>(r \, \cos 2u, r \, \sin 2u, r \, \cos u)</math> is identified with the point (''u'' + π, 1) whose coordinates are <math> (r \, \cos 2 u, r \, \sin 2 u, - r \, \cos u) </math>. But this means that pairs of opposite points on the rim of the (equivalent) ordinary disk are identified with each other; this is how a real projective plane is formed out of a disk. Therefore, the surface shown in Figure 1 (cross-cap with disk) is topologically equivalent to the real projective plane ''RP''2.

==Homogeneous coordinates==
{{main|Homogeneous coordinates}}
The points in the plane can be represented by [[homogeneous coordinates]]. A point has homogeneous coordinates [''x'' : ''y'' : ''z''], where the coordinates [''x'' : ''y'' : ''z''] and [''tx'' : ''ty'' : ''tz''] are considered to represent the same point, for all nonzero values of ''t''. The points with coordinates [''x'' : ''y'' : 1] are the usual [[real plane]], called the '''finite part''' of the projective plane, and points with coordinates [''x'' : ''y'' : 0], called '''points at infinity''' or '''ideal points''', constitute a line called the '''[[line at infinity]]'''. (The homogeneous coordinates [0 : 0 : 0] do not represent any point.)

The lines in the plane can also be represented by homogeneous coordinates. A projective line corresponding to the plane {{nowrap|''ax'' + ''by'' + ''cz'' {{=}} 0}} in '''R'''3 has the homogeneous coordinates (''a'' : ''b'' : ''c''). Thus, these coordinates have the equivalence relation (''a'' : ''b'' : ''c'') = (''da'' : ''db'' : ''dc'') for all nonzero values of ''d''. Hence a different equation of the same line ''dax'' + ''dby'' + ''dcz'' = 0 gives the same homogeneous coordinates.
A point [''x'' : ''y'' : ''z''] lies on a line (''a'' : ''b'' : ''c'') if ''ax'' + ''by'' + ''cz'' = 0.
Therefore, lines with coordinates (''a'' : ''b'' : ''c'') where ''a'', ''b'' are not both 0 correspond to the lines in the usual [[real plane]], because they contain points that are not at infinity. The line with coordinates (0 : 0 : 1) is the line at infinity, since the only points on it are those with ''z'' = 0.

===Points, lines, and planes===
[[Image:Proj geom1.PNG|right]]
A line in '''P'''2 can be represented by the equation ''ax'' + ''by'' + ''cz'' = 0. If we treat ''a'', ''b'', and ''c'' as the column vector '''ℓ''' and ''x'', ''y'', ''z'' as the column vector '''x''' then the equation above can be written in matrix form as:

:'''x'''T'''ℓ''' = 0 or '''ℓ'''T'''x''' = 0.

Using vector notation we may instead write '''x''' ⋅ '''ℓ''' = 0 or '''ℓ''' ⋅ '''x''' = 0.

The equation ''k''('''x'''T'''ℓ''') = 0 (which k is a non-zero scalar) sweeps out a plane that goes through zero in '''R'''3 and ''k''(''x'') sweeps out a line, again going through zero. The plane and line are [[linear subspace]]s in [[real coordinate space|'''R'''3]], which always go through zero.
{{Clear}}

===Ideal points===
[[Image:prj geom.svg|right]]

In '''P'''2 the equation of a line is {{nowrap|''ax'' + ''by'' + ''cz'' {{=}} 0}} and this equation can represent a line on any plane parallel to the ''x'', ''y'' plane by multiplying the equation by ''k''.

If {{nowrap|''z'' {{=}} 1}} we have a normalized homogeneous coordinate. All points that have ''z'' = 1 create a plane. Let's pretend we are looking at that plane (from a position further out along the ''z'' axis and looking back towards the origin) and there are two parallel lines drawn on the plane. From where we are standing (given our visual capabilities) we can see only so much of the plane, which we represent as the area outlined in red in the diagram. If we walk away from the plane along the ''z'' axis, (still looking backwards towards the origin), we can see more of the plane. In our field of view original points have moved. We can reflect this movement by dividing the homogeneous coordinate by a constant. In the adjacent image we have divided by 2 so the ''z'' value now becomes 0.5. If we walk far enough away what we are looking at becomes a point in the distance. As we walk away we see more and more of the parallel lines. The lines will meet at a line at infinity (a line that goes through zero on the plane at {{nowrap|''z'' {{=}} 0}}). Lines on the plane when {{nowrap|''z'' {{=}} 0}} are ideal points. The plane at {{nowrap|''z'' {{=}} 0}} is the line at infinity.

The homogeneous point {{nowrap|(0, 0, 0)}} is where all the real points go when you're looking at the plane from an infinite distance, a line on the {{nowrap|''z'' {{=}} 0}} plane is where parallel lines intersect.
{{Clear}}

===Duality===
[[Image:Projective geometry diagram 2.svg|200px|right]]
In the equation {{nowrap|'''x'''T'''ℓ''' {{=}} 0}} there are two [[column vector]]s. You can keep either constant and vary the other. If we keep the point '''x''' constant and vary the coefficients '''ℓ''' we create new lines that go through the point. If we keep the coefficients constant and vary the points that satisfy the equation we create a line. We look upon '''x''' as a point, because the axes we are using are ''x'', ''y'', and ''z''. If we instead plotted the coefficients using axis marked ''a'', ''b'', ''c'' points would become lines and lines would become points. If you prove something with the [[data plot]]ted on axis marked ''x'', ''y'', and ''z'' the same argument can be used for the data plotted on axis marked ''a'', ''b'', and ''c''. That is duality.
{{Clear}}

====Lines joining points and intersection of lines (using duality)====
The equation {{nowrap|'''x'''T'''ℓ''' {{=}} 0}} calculates the [[dot product|inner product]] of two column vectors. The inner product of two vectors is zero if the vectors are [[orthogonal]]. In '''P'''2, the line between the points '''x'''1 and '''x'''2 may be represented as a column vector '''ℓ''' that satisfies the equations {{nowrap|'''x'''1T'''ℓ''' {{=}} 0}} and {{nowrap|'''x'''2T'''ℓ''' {{=}} 0}}, or in other words a column vector '''ℓ''' that is orthogonal to '''x'''1 and '''x'''2. The [[cross product]] will find such a vector: the line joining two points has homogeneous coordinates given by the equation {{nowrap|'''x'''1 × '''x'''2}}. The intersection of two lines may be found in the same way, using duality, as the cross product of the vectors representing the lines, {{nowrap|'''ℓ'''1 × '''ℓ'''2}}.

==Embedding into 4-dimensional space==
The projective plane embeds into 4-dimensional Euclidean space. The real projective plane '''P'''2('''R''') is the [[Quotient space (topology)|quotient]] of the two-sphere

:'''S'''2 = {(''x'', ''y'', ''z'') ∈ '''R'''3 : ''x''2 + ''y''2 + ''z''2 = 1}

by the antipodal relation {{nowrap|(''x'', ''y'', ''z'') ~ (−''x'', −''y'', −''z'')}}. Consider the function {{nowrap|'''R'''3 → '''R'''4}} given by {{nowrap|(''x'', ''y'', ''z'') ↦ (''xy'', ''xz'', ''y''2 − ''z''2, 2''yz'')}}. This map restricts to a map whose domain is '''S'''2 and, since each component is a homogeneous polynomial of even degree, it takes the same values in '''R'''4 on each of any two antipodal points on '''S'''2. This yields a map {{nowrap|'''P'''2('''R''') → '''R'''4}}. Moreover, this map is an embedding. Notice that this embedding admits a projection into '''R'''3 which is the [[Roman surface]].

==Higher non-orientable surfaces==
By gluing together projective planes successively we get non-orientable surfaces of higher [[genus (mathematics)|demigenus]]. The gluing process consists of cutting out a little disk from each surface and identifying (''gluing'') their boundary circles. Gluing two projective planes creates the [[Klein bottle]].

The article on the [[fundamental polygon]] describes the higher non-orientable surfaces.

==See also==
*[[Real projective space]]
*[[Projective space]]
*[[Pu's inequality|Pu's inequality for real projective plane]]
*[[Smooth projective plane]]

==References==
{{reflist}}
{{refbegin}}
*Coxeter, H.S.M. (1955), ''The Real Projective Plane'', 2nd ed. Cambridge: At the University Press.
*Reinhold Baer, Linear Algebra and Projective Geometry, Dover, 2005 ({{isbn|0-486-44565-8}} )
* {{citation | first = David A. | last = Richter | url = http://homepages.wmich.edu/~drichter/rptwo.htm | title = Two Models of the Real Projective Plane | access-date = 2010-04-15 }}
{{refend}}

== External links ==
* {{MathWorld|urlname=RealProjectivePlane|title=Real Projective Plane}}
* [http://vis.cs.brown.edu/docs/pdf/Demiralp-2009-CLF.pdf Line field coloring using Werner Boy's real projective plane immersion]
* [https://www.youtube.com/watch?v=lDqmaPEjJpk The real projective plane on YouTube]

{{Compact topological surfaces}}

[[Category:Surfaces]]
[[Category:Geometric topology]]

Robert Fano

2020-05-02T10:16:12Z

Magmalex: /* Early life and education */ corrected a typo

{{short description|Italian-American computer scientist}}
{{Infobox scientist
| image = Robert Fano 2012-03-13.jpg
| image_size =
| alt =
| caption = Prof. Fano in his office at MIT in 2012
| birth_name = Roberto Mario Fano
| birth_date = {{Birth date|df=yes|1917|11|11}}
| birth_place = [[Turin]], Italy
| death_date = {{Death date and age|df=yes|2016|07|13|1917|11|11}}
| death_place = [[Naples, Florida]]
| citizenship = United States of America
| fields = [[computer science]], [[information theory]]
| workplaces = [[Massachusetts Institute of Technology]]
| alma_mater = MIT
| thesis_title = Theoretical Limitations on the Broadband Matching of Arbitrary Impedances
| thesis_year = 1947
| doctoral_advisor = [[Ernst Guillemin]]
| known_for = [[Shannon–Fano coding]], founder of [[Project MAC]]
| awards = [[IEEE James H. Mulligan Jr. Education Medal]] (1977) [[Claude E. Shannon Award|Shannon Award]] (1976) [[IEEE Fellow]] (1954)
}}
'''Roberto Mario "Robert" Fano''' (11 November 1917 – 13 July 2016) was an Italian-American computer scientist and professor of [[electrical engineering]] and [[computer science]] at the [[Massachusetts Institute of Technology]].<ref>{{cite news|url=https://www.nytimes.com/2008/03/13/world/europe/13weizenbaum.html|title=Joseph Weizenbaum Dies; Computer Pioneer Was 85|last=Markoff|first=John|date=13 March 2008|work=The New York Times|page=22|accessdate=15 August 2011}}</ref>

==Early life and education==
Fano was born in Turin, Italy in 1917<ref name="Seising2007">{{cite book|last=Seising|first=Rudolf|title=Fuzzification of systems: the genesis of fuzzy set theory and its initial applications - developments up to the 1970s|url=https://books.google.com/books?id=rdYRdlM2dAQC&pg=PA33|accessdate=15 August 2011|date=2007-08-08|publisher=Springer|isbn=978-3-540-71794-2|pages=33–}}</ref><ref>{{cite web|title=United States Public Records Index|url=https://familysearch.org/pal:/MM9.1.1/KLDP-WJX|publisher=FamilySearch|accessdate=9 August 2013}}</ref> to a [[History of the Jews in Italy|Jewish]] family and grew up in Turin.<ref>[https://opinionator.blogs.nytimes.com/2011/06/23/did-my-brother-invent-e-mail-with-tom-van-vleck-part-five/ Did My Brother Invent E-Mail With Tom Van Vleck? (Part Five)]
BY ERROL MORRIS JUNE 23, 2011, New York Times</ref> Fano's father was the mathematician [[Gino Fano]], his older brother was the physicist [[Ugo Fano]], and [[Giulio Racah]] was a cousin.<ref>{{cite book|title=The New York Times biographical service|url=https://books.google.com/books?id=_HkoAQAAIAAJ|year=2001|publisher=New York Times & Arno Press|pages=297}}</ref> Fano studied engineering as an [[undergraduate]] at the School of Engineering of Torino (Politecnico di Torino) until 1939, when he emigrated to the United States as a result of anti-Jewish legislation passed under [[Benito Mussolini]].<ref>{{cite web | authorlink = Errol Morris | last = Morris | first = Errol | title = Did My Brother Invent E-Mail With Tom Van Vleck? (Part Five) | url = http://opinionator.blogs.nytimes.com/2011/06/23/did-my-brother-invent-e-mail-with-tom-van-vleck-part-five/#more-96615 | accessdate = 2012-03-14 | date = 23 June 2011 | work = Opinionator | publisher = [[The New York Times]]}}</ref> He received his [[bachelor's degree|S.B.]] in electrical engineering from MIT in 1941, and upon graduation joined the staff of the [[MIT Radiation Laboratory]]. After World War II, Fano continued on to complete his [[Sc.D.]] in electrical engineering from MIT in 1947. His thesis, titled "Theoretical Limitations on the Broadband Matching of Arbitrary Impedances",<ref>{{cite web | title = Theoretical Limitations on the Broadband Matching of Arbitrary Impedances - MIT Technical Report no. 41| url = http://www.ieeeghn.org/wiki/images/temp/7/7b/20090630194833!HOBIM_Fano_MIT_TR41.pdf | accessdate = 2013-05-18 | date = 2 January 1948| publisher = MIT Research Laboratory of Electronics}}</ref> was supervised by [[Ernst Guillemin]].

==Career==
Fano's career spans three areas, microwave systems, information theory, and computer science.

Fano joined the MIT faculty in 1947 to what was then called the Department of Electrical Engineering. Between 1950 and 1953, he led the Radar Techniques Group at [[Lincoln Laboratory]].<ref name="Lee1995"/> In 1954, Fano was made an [[IEEE Fellow]] for "contributions in the field of information theory and microwave filters".<ref>{{cite web | publisher = [[Institute of Electrical and Electronics Engineers]] | url = http://www.ieee.org/membership_services/membership/fellows/alphabetical/ffellows.html | title = IEEE Fellows - F | accessdate = 2012-03-13 | url-status = dead | archiveurl = https://web.archive.org/web/20131112201704/http://www.ieee.org/membership_services/membership/fellows/alphabetical/ffellows.html | archivedate = 12 November 2013 | df = dmy-all }}</ref> He was elected to the [[American Academy of Arts and Sciences]] in 1958, to the [[National Academy of Engineering]] in 1973, and to the [[National Academy of Sciences]] in 1978.<ref name="Lee1995">{{cite book|last=Lee|first=John A. N.|title=International biographical dictionary of computer pioneers|url=https://books.google.com/books?id=ocx4Jc12mkgC&pg=PA296|accessdate=15 August 2011|year=1995|publisher=Taylor & Francis US|isbn=978-1-884964-47-3|pages=296–}}</ref><ref>Dates of election per the [http://www.amacad.org/members.aspx American Academy] and [http://www.nationalacademies.org/memarea/ National Academies] membership lists.</ref>

Fano was known principally for his work on [[information theory]]. He developed [[Shannon–Fano coding]]<ref name="Salomon2007">{{cite book|last=Salomon|first=David|title=Data compression: the complete reference|url=https://books.google.com/books?id=ujnQogzx_2EC&pg=PA72|accessdate=15 August 2011|year=2007|publisher=Springer|isbn=978-1-84628-602-5|pages=72–}}</ref> in collaboration with [[Claude Shannon]], and derived the [[Fano inequality]]. He also invented the [[Sequential decoding|Fano algorithm]] and postulated the [[Sequential decoding|Fano metric]].<ref>{{Cite journal
| last = Fano
| first = Robert M.
| date = April 1963
| title = A heuristic discussion of probabilistic decoding
| journal = IEEE Transactions on Information Theory
| volume=9 |issue=2 |pages=64–73
| doi = 10.1109/tit.1963.1057827
}}</ref>

In the early 1960s, Fano was involved in the development of [[time-sharing]] computers. From 1963 until 1968 Fano served as the founding director of MIT's [[Project MAC]], which evolved to become what is now known as the [[MIT Computer Science and Artificial Intelligence Laboratory]].<ref name="WildesLindgren1985">{{cite book|last1=Wildes|first1=Karl L.|last2=Lindgren|first2=Nilo A.|title=A century of electrical engineering and computer science at MIT, 1882-1982|url=https://archive.org/details/centuryofelectri0000wild|url-access=registration|accessdate=15 August 2011|year=1985|publisher=MIT Press|isbn=978-0-262-23119-0|pages=[https://archive.org/details/centuryofelectri0000wild/page/348 348]–}}</ref><ref name="BelzerHolzman1979">{{cite book|last1=Belzer|first1=Jack|last2=Holzman|first2=Albert G.|last3=Kent|first3=Allen|title=Encyclopedia of computer science and technology: Pattern recognition to reliability of computer systems|url=https://books.google.com/books?id=IFmaqTI9-KsC&pg=PA339|accessdate=15 August 2011|date=1979-05-01|publisher=CRC Press|isbn=978-0-8247-2262-3|pages=339–}}</ref> He also helped to create MIT's original computer science curriculum.

In 1967, Fano received the [[Claude E. Shannon Award]] for his work in information theory.<ref name=Lee1995/> In 1977 he was recognized for his contribution to the teaching of electrical engineering with the IEEE James H. Mulligan Jr. Education Medal.<ref>{{cite web|url=http://www.ieee.org/documents/education_rl.pdf |title=IEEE James H. Mulligan Jr. Education Medal Recipients |publisher=IEEE |accessdate=December 9, 2014}}</ref>

Fano retired from active teaching in 1984,<ref name="mit-obit"/> and died on 13 July 2016 at the age of 98.<ref name="mit-obit">{{cite web|title=Robert Fano, computing pioneer and founder of CSAIL, dies at 98|url=https://news.mit.edu/2016/robert-fano-obituary-0715|publisher=MIT News Office|author1=Conner-Simons, Adam|author2=Gordon, Rachel|date=July 15, 2016|accessdate=15 July 2016|url-status=dead|archiveurl=https://web.archive.org/web/20160716123820/http://news.mit.edu/2016/robert-fano-obituary-0715|archivedate=16 July 2016|df=dmy-all}}</ref>

==Bibliography==
In addition to his work in information theory, Fano also published articles and books about microwave systems,<ref name="Lee2004">{{cite book|last=Lee|first=Thomas H.|title=Planar microwave engineering: a practical guide to theory, measurement, and circuits|url=https://books.google.com/books?id=uoj3IWFxbVYC&pg=PA93|accessdate=15 August 2011|year=2004|publisher=Cambridge University Press|isbn=978-0-521-83526-8|pages=93–}}</ref> electromagnetism, network theory, and engineering education. His longer publications include:

*"The Theory of Microwave Filters" and "The Design of Microwave Filters", chapters 9 and 10 in George L. Ragan, ed., ''Microwave Transmission Circuits'', vol. 9 in the ''Radiation Laboratory Series'' (with A. W. Lawson, 1948).
*''Electromagnetic Energy Transmission and Radiation'' (with [[Lan Jen Chu]] and Richard B. Adler, 1960).
*''Electromagnetic Fields, Energy, and Forces'' (with Chu and Adler, 1960).
* {{cite book | last=Fano | first=Robert | title=Transmission of information: a statistical theory of communications | publisher=MIT Press | location=Cambridge, Mass | year=1966 | url=https://archive.org/details/TransmissionOfInformationAStatisticalTheoryOfCommunicationRobertFano| isbn=978-0-262-56169-3 | oclc=804123877 | ref=harv}}

==References==
{{Reflist|2}}

==External links==
{{Commons category}}
* [http://purl.umn.edu/107281 Oral history interview with Robert M. Fano] 20 April 1989. [[Charles Babbage Institute]] University of Minnesota. Fano discusses his move to computer science from information theory and his interaction with the [[Advanced Research Projects Agency]] (ARPA). Topics include: computing research at the Massachusetts Institute of Technology (MIT); the work of J.C.R. Licklider at the [[Information Processing Techniques Office]] of ARPA; time-sharing and computer networking research; Project MAC; computer science education; CTSS development; [[System Development Corporation]] (SDC); the development of ARPANET; and a comparison of ARPA, National Science Foundation, and Office of Naval Research computer science funding.
* {{YouTube|sjnmcKVnLi0|Video of Robert Fano}} from 1964, demonstrating the [[Compatible Time-Sharing System]] (CTSS).
* {{MathGenealogy|id=64848}}

{{IEEE James H. Mulligan, Jr. Education Medal}}
{{Claude E. Shannon Award recipients}}
{{Authority control}}

{{Use dmy dates|date=August 2011}}

{{DEFAULTSORT:Fano, Robert}}
[[Category:1917 births]]
[[Category:2016 deaths]]
[[Category:American computer scientists]]
[[Category:Italian computer scientists]]
[[Category:American information theorists]]
[[Category:Italian information theorists]]
[[Category:Jewish American scientists]]
[[Category:Members of the United States National Academy of Engineering]]
[[Category:Members of the United States National Academy of Sciences]]
[[Category:American people of Italian-Jewish descent]]
[[Category:Italian refugees]]
[[Category:Italian Jews]]
[[Category:People from Turin]]
[[Category:20th-century American engineers]]
[[Category:20th-century American scientists]]
[[Category:Massachusetts Institute of Technology alumni]]
[[Category:Massachusetts Institute of Technology faculty]]
[[Category:Fellow Members of the IEEE]]
[[Category:Fellows of the American Academy of Arts and Sciences]]
[[Category:MIT Lincoln Laboratory people]]

Generating set of a module

2020-04-26T00:46:41Z

Magmalex: Just clarified the subject of a sentence

In [[abstract algebra|algebra]], a '''generating set''' ''G'' of a [[module (mathematics)|module]] ''M'' over a [[ring (mathematics)|ring]] ''R'' is a subset of ''M'' such that the smallest submodule of ''M'' containing ''G'' is ''M'' itself (the smallest submodule containing a subset is the intersection of all submodules containing the set). The set ''G'' is then said to generate ''M''. For example, the ring ''R'' is generated by the identity element 1 as a left ''R''-module over itself. If there is a finite generating set, then a module is said to be [[finitely generated module|finitely generated]].

Explicitly, if ''G'' is a generating set of a module ''M'', then every element of ''M'' is a (finite) ''R''-linear combination of some elements of ''G''; i.e., for each ''x'' in ''M'', there are ''r''1, ..., ''r''''m'' in ''R'' and ''g''1, ..., ''g''''m'' in ''G'' such that

: <math> x = r_1 g_1 + \cdots + r_m g_m. </math>

Put in another way, there is a surjection

: <math> \bigoplus_{g \in G} R \to M, \, r_g \mapsto r_g g,</math>

where we wrote ''r''''g'' for an element in the ''g''-th component of the direct sum. (Coincidentally, since a generating set always exists; for example, ''M'' itself, this shows that a module is a quotient of a free module, a useful fact.)

A generating set of a module is said to be '''minimal''' if no proper subset of the set generates the module. If ''R'' is a [[field (mathematics)|field]], then a generating set is the same thing as a [[basis (linear algebra)|basis]]. Unless the module is [[finitely-generated module|finitely-generated]], there may exist no minimal generating set.<ref>{{cite web|url=https://mathoverflow.net/q/33540 |title=ac.commutative algebra – Existence of a minimal generating set of a module – MathOverflow|work=mathoverflow.net}}</ref>

The cardinality of a minimal generating set need not be an invariant of the module; '''Z''' is generated as a principal ideal by 1, but it is also generated by, say, a minimal generating set {{nowrap|{ 2, 3 }}}. What is uniquely determined by a module is the [[infimum]] of the numbers of the generators of the module.

Let ''R'' be a local ring with maximal ideal ''m'' and residue field ''k'' and ''M'' finitely generated module. Then [[Nakayama's lemma]] says that ''M'' has a minimal generating set whose cardinality is <math>\dim_k M / mM = \dim_k M \otimes_R k</math>. If ''M'' is flat, then this minimal generating set is [[linearly independent]] (so ''M'' is free). See also: [[minimal resolution (algebra)|minimal resolution]].

A more refined information is obtained if one considers the relations between the generators; cf. [[free presentation of a module]].

== See also ==
*[[Countably generated module]]
*[[Flat module]]
*[[Invariant basis number]]

== References ==
{{reflist}}
*Dummit, David; Foote, Richard. ''Abstract Algebra''.

[[Category:Abstract algebra]]

{{algebra-stub}}

Characteristic subgroup

2020-04-24T04:35:38Z

Magmalex: /* Definition */ added definition in mathematical notation

{{Short description|subgroup mapped to itself under every automorphism of the parent group}}
In [[mathematics]], particularly in the area of [[abstract algebra]] known as [[group theory]], a '''characteristic subgroup''' is a [[subgroup]] that is mapped to itself by every [[automorphism]] of the parent [[group (mathematics)|group]].<ref>{{cite book | last1=Dummit | first1=David S. | last2=Foote | first2=Richard M. | title=Abstract Algebra | publisher=[[John Wiley & Sons]] | year=2004 | edition=3rd | isbn=0-471-43334-9}}</ref><ref>{{cite book | last=Lang | first=Serge | authorlink=Serge Lang | title=Algebra | publisher=[[Springer Science+Business Media|Springer]] | series=[[Graduate Texts in Mathematics]] | year=2002 | isbn=0-387-95385-X}}</ref> Because every [[conjugation map]] is an [[inner automorphism]], every characteristic subgroup is [[normal subgroup|normal]]; though the converse is not guaranteed. Examples of characteristic subgroups include the [[commutator subgroup]] and the [[center of a group]].

== Definition ==
A subgroup {{math|''H''}} of a group {{math|''G''}} is called a '''characteristic subgroup''', {{math|''H'' char ''G''}}, if for every automorphism {{math|''φ''}} of {{math|''G''}}, {{math|φ[''H''] ≤ ''H''}} holds, i.e. if every automorphism of the parent group maps the subgroup to within itself:
:{{math|∀φ ∈ Aut(''G'')： φ[''H''] ≤ ''H''}}.

Given {{math|''H'' char ''G''}}, every automorphism of {{math|''G''}} induces an automorphism of the quotient group, {{math|''G/H''}}, which yields a map {{math|Aut(''G'') → Aut(''G''/''H'')}}.

If {{math|''G''}} has a unique subgroup {{math|''H''}} of a given (finite) index, then {{math|''H''}} is characteristic in {{math|''G''}}.

== Related concepts ==

=== Normal subgroup ===
{{main|Normal subgroup}}
A subgroup of {{math|''H''}} that is invariant under all inner automorphisms is called [[normal subgroup|normal]]; also, an invariant subgroup.
:{{math|∀φ ∈ Inn(''G'')： φ[''H''] ≤ ''H''}}

Since {{math|Inn(''G'') ⊆ Aut(''G'')}} and a characteristic subgroup is invariant under all automorphisms, every characteristic subgroup is normal. However, not every normal subgroup is characteristic. Here are several examples:
* Let {{math|''H''}} be a nontrivial group, and let {{math|''G''}} be the [[direct product of groups|direct product]], {{math|''H'' × ''H''}}. Then the subgroups, {{math|{1} × ''H''}} and {{math|''H'' × {1{{)}}}}, are both normal, but neither is characteristic. In particular, neither of these subgroups is invariant under the automorphism, {{math|(''x'', ''y'') → (''y'', ''x'')}}, that switches the two factors.
* For a concrete example of this, let {{math|''V''}} be the [[Klein four-group]] (which is [[group isomorphism|isomorphic]] to the direct product, {{math|[[cyclic group|ℤ{{sub|2}}]] × ℤ{{sub|2}}}}). Since this group is [[abelian group|abelian]], every subgroup is normal; but every permutation of the 3 non-identity elements is an automorphism of {{math|''V''}}, so the 3 subgroups of order 2 are not characteristic. Here {{math|V {{=}} {''e'', ''a'', ''b'', ''ab''} }}. Consider {{math|H {{=}} {''e'', ''a''{{)}}}} and consider the automorphism, {{math|T(''e'') {{=}} ''e'', T(''a'') {{=}} ''b'', T(''b'') {{=}} ''a'', T(''ab'') {{=}} ''ab''}}; then {{math|T(''H'')}} is not contained in {{math|''H''}}.
* In the [[quaternion group]] of order 8, each of the cyclic subgroups of order 4 is normal, but none of these are characteristic. However, the subgroup, {{math|{1, −1{{)}}}}, is characteristic, since it is the only subgroup of order 2.
* If {{math|''n''}} is even, the [[dihedral group]] of order {{math|2''n''}} has 3 subgroups of [[index of a subgroup|index]] 2, all of which are normal. One of these is the cyclic subgroup, which is characteristic. The other two subgroups are dihedral; these are permuted by an [[outer automorphism group|outer automorphism]] of the parent group, and are therefore not characteristic.

=== Strictly characteristic subgroup{{anchor|Strictly invariant subgroup}} ===
A ''{{vanchor|strictly characteristic subgroup}}'', or a ''{{vanchor|distinguished subgroup}}'', which is invariant under [[surjective]] [[endomorphism]]s. For [[finite group]]s, surjectivity of an endomorphism implies injectivity, so a surjective endomorphism is an automorphism; thus being ''strictly characteristic'' is equivalent to ''characteristic''. This is not the case anymore for infinite groups.

=== Fully characteristic subgroup{{anchor|Fully invariant subgroup}} ===
For an even stronger constraint, a ''fully characteristic subgroup'' (also, ''fully invariant subgroup''; cf. invariant subgroup), {{math|''H''}}, of a group, {{math|''G''}} is a group remaining [[invariant (mathematics)|invariant]] under every endomorphism of {{math|''G''}}; that is,
:{{math|∀φ ∈ End(''G'')： φ[''H''] ≤ ''H''}}.

Every group has itself (the improper subgroup) and the trivial subgroup as two of its fully characteristic subgroups. The [[commutator subgroup]] of a group is always a fully characteristic subgroup.<ref>
{{cite book
| title = Group Theory
| first = W.R. | last = Scott
| pages = 45–46
| publisher = Dover
| year = 1987
| isbn = 0-486-65377-3
}}</ref><ref>
{{cite book
| title = Combinatorial Group Theory
| first1 = Wilhelm | last1 = Magnus
| first2 = Abraham | last2 = Karrass
| first3 = Donald | last3 = Solitar
| publisher = Dover
| year = 2004
| pages = 74–85
| isbn = 0-486-43830-9
}}</ref>

Every endomorphism of {{math|''G''}} induces an endomorphism of {{math|''G/H''}}, which yields a map {{math|End(''G'') → End(''G''/''H'')}}.

=== Verbal subgroup ===
An even stronger constraint is [[verbal subgroup]], which is the image of a fully invariant subgroup of a [[free group]] under a homomorphism. More generally, any [[verbal subgroup]] is always fully characteristic. For any [[reduced free group]], and, in particular, for any [[free group]], the converse also holds: every fully characteristic subgroup is verbal.

== Transitivity ==
The property of being characteristic or fully characteristic is [[transitive relation|transitive]]; if {{math|''H''}} is a (fully) characteristic subgroup of {{math|''K''}}, and {{math|''K''}} is a (fully) characteristic subgroup of {{math|''G''}}, then {{math|''H''}} is a (fully) characteristic subgroup of {{math|''G''}}.
:{{math|''H'' char ''K'' char ''G'' ⇒ ''H'' char ''G''}}.

Moreover, while normality is not transitive, it is true that every characteristic subgroup of a normal subgroup is normal.
:{{math|''H'' char ''K'' ⊲ ''G'' ⇒ ''H'' ⊲ ''G''}}

Similarly, while being strictly characteristic (distinguished) is not transitive, it is true that every fully characteristic subgroup of a strictly characteristic subgroup is strictly characteristic.

However, unlike normality, if {{math|''H'' char ''G''}} and {{math|''K''}} is a subgroup of {{math|''G''}} containing {{math|''H''}}, then in general {{math|''H''}} is not necessarily characteristic in {{math|''K''}}.
:{{math|''H'' char ''G'', ''H'' < ''K'' < ''G'' ⇏ ''H'' char ''K''}}

== Containments ==
Every subgroup that is fully characteristic is certainly strictly characteristic and characteristic; but a characteristic or even strictly characteristic subgroup need not be fully characteristic.

The [[center of a group]] is always a strictly characteristic subgroup, but it is not always fully characteristic. For example, the finite group of order 12, {{math|Sym(3) × ℤ/2ℤ}}, has a homomorphism taking {{math|(''π'', ''y'')}} to {{math|((1, 2){{sup|''y''}}, 0)}}, which takes the center, {{math|1 × ℤ/2ℤ}}, into a subgroup of {{math|Sym(3) × 1}}, which meets the center only in the identity.

The relationship amongst these subgroup properties can be expressed as:
:[[Subgroup]] ⇐ [[Normal subgroup]] ⇐ '''Characteristic subgroup''' ⇐ Strictly characteristic subgroup ⇐ [[Fully characteristic subgroup]] ⇐ [[Verbal subgroup]]

==Examples==

=== Finite example ===
Consider the group {{math|''G'' {{=}} S{{sub|3}} × ℤ{{sub|2}}}} (the group of order 12 that is the direct product of the [[symmetric group]] of order 6 and a [[cyclic group]] of order 2). The center of {{math|''G''}} is its second factor {{math|ℤ{{sub|2}}}}. Note that the first factor, {{math|S{{sub|3}}}}, contains subgroups isomorphic to {{math|ℤ{{sub|2}}}}, for instance {{math|{e, (12)} }}; let {{math|''f'': ℤ{{sub|2}} → S{{sub|3}}}} be the morphism mapping {{math|ℤ{{sub|2}}}} onto the indicated subgroup. Then the composition of the projection of {{math|''G''}} onto its second factor {{math|ℤ{{sub|2}}}}, followed by {{math|''f''}}, followed by the inclusion of {{math|S{{sub|3}}}} into {{math|''G''}} as its first factor, provides an endomorphism of {{math|''G''}} under which the image of the center, {{math|ℤ{{sub|2}}}}, is not contained in the center, so here the center is not a fully characteristic subgroup of {{math|''G''}}.

=== Cyclic groups ===
Every subgroup of a cyclic group is characteristic.

=== Subgroup functors ===
The [[derived subgroup]] (or commutator subgroup) of a group is a verbal subgroup. The [[torsion subgroup]] of an [[abelian group]] is a fully invariant subgroup.

=== Topological groups ===
The [[identity component]] of a [[topological group]] is always a characteristic subgroup.

==See also==
* [[Characteristically simple group]]

==References==
{{reflist}}

[[Category:Subgroup properties]]

Monoidal category

2020-04-23T03:30:47Z

Magmalex: /* Formal definition */ substituted removes with cancels, since this is a sort of cancellation

{{redirect-distinguish|Internal product|Inner product}}
{{Short description|Category admitting tensor products}}In [[mathematics]], a '''monoidal category''' (or '''tensor category''') is a [[category (mathematics)|category]] <math>\mathbf C</math> equipped with a [[bifunctor]]
:<math>\otimes : \mathbf{C} \times \mathbf{C} \to \mathbf{C}</math>
that is [[associative]] [[up to]] a [[natural isomorphism]], and an [[Object (category theory)|object]] ''I'' that is both a [[left identity|left]] and [[right identity]] for ⊗, again up to a natural isomorphism. The associated natural isomorphisms are subject to certain [[coherence condition]]s, which ensure that all the relevant diagrams commute.

The ordinary [[tensor product]] makes [[vector space]]s, [[abelian group]]s, [[module (mathematics)|''R''-modules]], or [[algebra (ring theory)|''R''-algebras]] into monoidal categories. Monoidal categories can be seen as a generalization of these and other examples. Every (small) monoidal category may also be viewed as a "[[categorification]]" of an underlying [[monoid]], namely the monoid whose elements are the isomorphism classes of the category's objects and whose binary operation is given by the category's tensor product.

A rather different application, of which monoidal categories can be considered an abstraction, is that of a system of [[data type]]s closed under a [[type constructor]] that takes two types and builds an aggregate type; the types are the objects and <math>\otimes</math> is the aggregate constructor. The associativity up to isomorphism is then a way of expressing that different ways of aggregating the same data—such as <math>((a,b),c)</math> and <math>(a,(b,c))</math>—store the same information even though the aggregate values need not be the same; similarly the identity object is the [[void type]], which stores no information. The concept of monoidal category does not presume that values of such aggregate types can be taken apart; on the contrary, it provides a framework that unifies classical and [[quantum information]] theory.<ref>{{cite book |last1=Baez |first1=John |last2=Stay |first2=Mike |authorlink1=John C. Baez |chapter=Physics, topology, logic and computation: a Rosetta Stone |editor1-last=Coecke |editor1-first=Bob |title=New Structures for Physics |series=Lecture Notes in Physics |volume=813 |date=2011 |publisher=Springer, Berlin |pages=95–172 |isbn=9783642128219 |issn=0075-8450 |arxiv=0903.0340 }}</ref>

In [[category theory]], monoidal categories can be used to define the concept of a [[monoid object]] and an associated action on the objects of the category. They are also used in the definition of an [[enriched category]].

Monoidal categories have numerous applications outside of category theory proper. They are used to define models for the multiplicative fragment of [[intuitionistic logic|intuitionistic]] [[linear logic]]. They also form the mathematical foundation for the [[topological order]] in condensed matter. [[Braided monoidal category|Braided monoidal categories]] have applications in [[quantum information]], [[quantum field theory]], and [[string theory]].

==Formal definition==

A '''monoidal category''' is a category <math>\mathbf C</math> equipped with a monoidal structure. A monoidal structure consists of the following:
*a [[bifunctor]] <math>\otimes \colon \mathbf C\times\mathbf C\to\mathbf C</math> called the ''[[tensor product]]'' or ''monoidal product'',
*an object <math>I</math> called the ''unit object'' or ''identity object'',
*three [[natural isomorphism]]s subject to certain [[coherence condition]]s expressing the fact that the tensor operation
**is associative: there is a natural (in each of three arguments <math>A</math>, <math>B</math>, <math>C</math>) isomorphism <math>\alpha</math>, called ''associator'', with components <math>\alpha_{A,B,C} \colon A\otimes (B\otimes C) \cong (A\otimes B)\otimes C</math>,
**has <math>I</math> as left and right identity: there are two natural isomorphisms <math>\lambda</math> and <math>\rho</math>, respectively called ''left'' and ''right unitor'', with components <math>\lambda_A \colon I\otimes A\cong A</math> and <math>\rho_A \colon A\otimes I\cong A</math>.
:
Note that a good way to remember how <math> \lambda </math> and <math>\rho</math> act is by alliteration; ''Lambda'', <math>\lambda</math>, cancels the identity on the ''left'', while ''Rho'', <math>\rho</math>, cancels the identity on the ''right''.

The coherence conditions for these natural transformations are:
* for all <math>A</math>, <math>B</math>, <math>C</math> and <math>D</math> in <math>\mathbf C</math>, the pentagon [[diagram (category theory)|diagram]]

::[[File:Pentagonal diagram for monoidal categories.svg|center|This is one of the main diagrams used to define a monoidal category; it is perhaps the most important one.]]
: [[Commutative diagram|commutes]];
* for all <math>A</math> and <math>B</math> in <math>\mathbf C</math>, the triangle diagram
[[File:Monoidal2.svg|center|This is one of the diagrams used in the definition of a monoidal cateogory. It takes care of the case for when there is an instance of an identity between two objects.]]
: commutes;

A '''strict monoidal category''' is one for which the natural isomorphisms ''α'', ''λ'' and ''ρ'' are identities. Every monoidal category is monoidally [[equivalence of categories|equivalent]] to a strict monoidal category.

==Examples==

*Any category with finite [[product (category theory)|product]]s can be regarded as monoidal with the product as the monoidal product and the [[terminal object]] as the unit. Such a category is sometimes called a '''[[cartesian monoidal category]]'''. For example:
**'''Set''', the [[category of sets]] with the Cartesian product, any particular one-element set serving as the unit.
**'''Cat''', the category of small categories with the [[product category]], where the category with one object and only its identity map is the unit.
*Dually, any category with finite [[coproduct]]s is monoidal with the coproduct as the monoidal product and the [[initial object]] as the unit. Such a monoidal category is called '''cocartesian monoidal'''
*'''''R''-Mod''', the [[category of modules]] over a [[commutative ring]] ''R'', is a monoidal category with the [[tensor product of modules]] ⊗''R'' serving as the monoidal product and the ring ''R'' (thought of as a module over itself) serving as the unit. As special cases one has:
**'''''K''-Vect''', the [[category of vector spaces]] over a [[field (mathematics)|field]] ''K'', with the one-dimensional vector space ''K'' serving as the unit.
**'''Ab''', the [[category of abelian groups]], with the group of [[integer]]s '''Z''' serving as the unit.
*For any commutative ring ''R'', the category of [[R-algebra|''R''-algebras]] is monoidal with the [[tensor product of algebras]] as the product and ''R'' as the unit.
*The [[category of pointed spaces]] (restricted to [[compactly generated space]]s<nowiki/> for example) is monoidal with the [[smash product]] serving as the product and the pointed [[0-sphere]] (a two-point discrete space) serving as the unit.
*The category of all [[endofunctor]]s on a category '''C''' is a ''strict'' monoidal category with the composition of functors as the product and the identity functor as the unit.
*Just like for any category '''E''', the [[Subcategory#Embeddings|full subcategory]] spanned by any given object is a monoid, it is the case that for any [[2-category]] '''E''', and any object '''C''' in Ob('''E'''), the full 2-subcategory of '''E''' spanned by {'''C'''} is a monoidal category. In the case '''E''' = '''Cat''', we get the [[endofunctor]]s example above.
* [[Semilattice|Bounded-above meet semilattices]] are strict [[symmetric monoidal category|symmetric monoidal categories]]: the product is meet and the identity is the top element.
* Any ordinary monoid <math>(M,\cdot,1)</math> is a small monoidal category with object set <math>M</math>, only identities for [[morphism]]s, <math>\cdot</math> as tensorproduct and <math>1</math> as its identity object. Conversely, the set of isomorphism classes (if such a thing makes sense) of a monoidal category is a monoid w.r.t. the tensor product.

===Monoidal preorders===
{{Tone|section|date=March 2017}}
Monoidal preorders, also known as "preordered monoids", are special cases of monoidal categories. This sort of structure comes up in the theory of [[Semi-Thue system|string rewriting systems]], but it is plentiful in pure mathematics as well. For example, the set <math>\mathbb{N}</math> of [[natural numbers]] has both a [[Monoid#Examples|monoid structure]] (using + and 0) and a [[Preorder#Examples|preorder structure]] (using ≤), which together form a monoidal preorder, basically because <math>m\leq n</math> and <math>m'\leq n'</math> implies <math>m+m'\leq n+n'</math>. We now present the general case.

It's well known that a [[preorder]] can be considered as a category '''C''', such that for every two objects <math>c, c'\in\mathrm{Ob}(\mathbf{C})</math>, there exists ''at most one'' morphism <math>c\to c'</math> in '''C'''. If there happens to be a morphism from ''c'' to ''c' '', we could write <math>c\leq c'</math>, but in the current section we find it more convenient to express this fact in arrow form <math>c\to c'</math>. Because there is at most one such morphism, we never have to give it a name, such as <math>f\colon c\to c'</math>. The [[Reflexive relation|reflexivity]] and [[transitive relation|transitivity]] properties of an order are respectively accounted for by the identity morphism and the composition formula in '''C'''. We write <math>c\cong c'</math> iff <math>c\leq c'</math> and <math>c'\leq c</math>, i.e. if they are isomorphic in '''C'''. Note that in a [[partial order]], any two isomorphic objects are in fact equal.

Moving forward, suppose we want to add a monoidal structure to the preorder '''C'''. To do so means we must choose
* an object <math>I\in\mathbf{C}</math>, called the ''monoidal unit'', and
* a functor <math>\mathbf{C}\times\mathbf{C}\to\mathbf{C}</math>, which we will denote simply by the dot "<math>\;\cdot\;</math>", called the ''monoidal multiplication''.
Thus for any two objects <math>c_1, c_2</math> we have an object <math>c_1\cdot c_2</math>. We must choose <math>I</math> and <math>\cdot</math> to be associative and unital, up to isomorphism. This means we must have:
: <math>(c_1\cdot c_2)\cdot c_3 \cong c_1\cdot (c_2\cdot c_3)</math> and <math>I\cdot c \cong c\cong c\cdot I</math>.
Furthermore, the fact that · is required to be a functor means—in the present case, where '''C''' is a preorder—nothing more than the following:
:if <math>c_1\to c_1'</math> and <math>c_2\to c_2'</math> then <math>(c_1\cdot c_2)\to (c_1'\cdot c_2')</math>.
The additional coherence conditions for monoidal categories are vacuous in this case because every diagram commutes in a preorder.

Note that if '''C''' is a partial order, the above description is simplified even more, because the associativity and unitality isomorphisms becomes equalities. Another simplification occurs if we assume that the set of objects is the [[free monoid]] on a generating set <math>\Sigma</math>. In this case we could write <math>\mathrm{Ob}(\mathbf{C})=\Sigma^*</math>, where * denotes the [[Kleene star]] and the monoidal unit ''I'' stands for the empty string. If we start with a set ''R'' of generating morphisms (facts about ≤), we recover the usual notion of [[semi-Thue system]], where ''R'' is called the "rewriting rule".

To return to our example, let '''N''' be the category whose objects are the natural numbers 0, 1, 2, ..., with a single morphism <math>i\to j</math> if <math>i\leq j</math> in the usual ordering (and no morphisms from ''i'' to ''j'' otherwise), and a monoidal structure with the monoidal unit given by 0 and the monoidal multiplication given by the usual addition, <math>i\cdot j := i+j</math>. Then '''N''' is a monoidal preorder; in fact it is the one freely generated by a single object 1, and a single morphism 0 ≤ 1, where again 0 is the monoidal unit.

== Properties and associated notions ==
It follows from the three defining coherence conditions that ''a large class'' of diagrams (i.e. diagrams whose morphisms are built using <math>\alpha</math>, <math>\lambda</math>, <math>\rho</math>, identities and tensor product) commute: this is [[Saunders Mac Lane|Mac Lane's]] "[[coherence theorem]]". It is sometimes inaccurately stated that ''all'' such diagrams commute.

There is a general notion of [[monoid object]] in a monoidal category, which generalizes the ordinary notion of [[monoid]] from [[abstract algebra]]. Ordinary monoids are precisely the monoid objects in the cartesian monoidal category '''Set'''. Further, any strict monoidal category can be seen as a monoid object in the category of categories '''Cat''' (equipped with the monoidal structure induced by the cartesian product).

[[Monoidal functor]]s are the functors between monoidal categories that preserve the tensor product and [[monoidal natural transformation]]s are the natural transformations, between those functors, which are "compatible" with the tensor product.

Every monoidal category can be seen as the category '''B'''(∗, ∗) of a [[bicategory]] '''B''' with only one object, denoted ∗.

A category '''C''' [[Enriched category|enriched]] in a monoidal category '''M''' replaces the notion of a set of morphisms between pairs of objects in '''C''' with the notion of an '''M'''-object of morphisms between every two objects in '''C'''.

=== Free strict monoidal category ===

For every category '''C''', the [[free category|free]] strict monoidal category Σ('''C''') can be constructed as follows:
* its objects are lists (finite sequences) ''A''1, ..., ''A''''n'' of objects of '''C''';
* there are arrows between two objects ''A''1, ..., ''A''''m'' and ''B''1, ..., ''B''''n'' only if ''m'' = ''n'', and then the arrows are lists (finite sequences) of arrows ''f''1: ''A''1 → ''B''1, ..., ''f''''n'': ''A''''n'' → ''B''''n'' of '''C''';
* the tensor product of two objects ''A''1, ..., ''A''''n'' and ''B''1, ..., ''B''''m'' is the concatenation ''A''1, ..., ''A''''n'', ''B''1, ..., ''B''''m'' of the two lists, and, similarly, the tensor product of two morphisms is given by the concatenation of lists. The identity object is the empty list.
This operation Σ mapping category '''C''' to Σ('''C''') can be extended to a strict 2-[[Monad (category theory)|monad]] on '''Cat'''.

== Specializations ==
* If, in a monoidal category, <math>A\otimes B</math> and <math>B\otimes A</math> are naturally isomorphic in a manner compatible with the coherence conditions, we speak of a [[braided monoidal category]]. If, moreover, this natural isomorphism is its own inverse, we have a [[symmetric monoidal category]].
* A [[closed monoidal category]] is a monoidal category where the functor <math>X \mapsto X \otimes A</math> has a [[Adjoint functors|right adjoint]], which is called the "internal Hom-functor" <math>X \mapsto \mathrm{Hom}_{\mathbf C}(A , X)</math>. Examples include [[cartesian closed category|cartesian closed categories]] such as '''Set''', the category of sets, and [[compact closed category|compact closed categories]] such as '''FdVect''', the category of finite-dimensional vector spaces.
* [[Autonomous category|Autonomous categories]] (or [[Compact closed category|compact closed categories]] or [[Rigid category|rigid categories]]) are monoidal categories in which duals with nice properties exist; they abstract the idea of '''FdVect'''.
* [[Dagger symmetric monoidal category|Dagger symmetric monoidal categories]], equipped with an extra dagger functor, abstracting the idea of '''FdHilb''', finite-dimensional Hilbert spaces. These include the [[Dagger compact category|dagger compact categories]].
* [[Tannakian category|Tannakian categories]] are monoidal categories enriched over a field, which are very similar to representation categories of linear algebraic groups.

== See also ==
{{Portal|Mathematics}}
* [[Skeleton (category theory)]]
* [[Spherical category]]
* [[Monoidal category action]]

== References ==
{{Reflist}}

* [[André Joyal|Joyal, André]]; [[Ross Street|Street, Ross]] (1993). "Braided Tensor Categories". ''Advances in Mathematics'' ''102'', 20–78.
* [[André Joyal|Joyal, André]]; [[Ross Street|Street, Ross]] (1988). "[http://maths.mq.edu.au/~street/PlanarDiags.pdf Planar diagrams and tensor algebra]".
* [[Max Kelly|Kelly, G. Max]] (1964). "On MacLane's Conditions for Coherence of Natural Associativities, Commutativities, etc." ''Journal of Algebra'' ''1'', 397–402
* {{cite book | last = Kelly | first = G. Max | url = http://www.tac.mta.ca/tac/reprints/articles/10/tr10.pdf | title = Basic Concepts of Enriched Category Theory | series = London Mathematical Society Lecture Note Series No. 64 | publisher = Cambridge University Press | year = 1982}}
* [[Saunders Mac Lane|Mac Lane, Saunders]] (1963). "Natural Associativity and Commutativity". ''Rice University Studies'' ''49'', 28–46.
* Mac Lane, Saunders (1998), ''[[Categories for the Working Mathematician]]'' (2nd ed.). New York: Springer-Verlag.
*{{nlab|id=monoidal+category|title=Monoidal category}}

{{Category theory}}

[[Category:Monoidal categories| ]]

Category of topological spaces

2020-04-17T02:59:04Z

Magmalex: /* Other properties */ Added link to extremal monomorphism

In [[mathematics]], the '''category of topological spaces''', often denoted '''Top''', is the [[category (category theory)|category]] whose [[object (category theory)|object]]s are [[topological space]]s and whose [[morphism]]s are [[continuous map]]s. This is a category because the [[function composition|composition]] of two continuous maps is again continuous, and the identity function is continuous. The study of '''Top''' and of properties of [[topological space]]s using the techniques of [[category theory]] is known as '''categorical topology'''.

N.B. Some authors use the name '''Top''' for the categories with [[topological manifold]]s or with [[compactly generated space|compactly generated spaces]].as objects and continuous maps as morphisms.

==As a concrete category==

Like many categories, the category '''Top''' is a [[concrete category]], meaning its objects are [[Set (mathematics)|sets]] with additional structure (i.e. topologies) and its morphisms are [[function (mathematics)|function]]s preserving this structure. There is a natural [[forgetful functor]]
:''U'' : '''Top''' → '''Set'''
to the [[category of sets]] which assigns to each topological space the underlying set and to each continuous map the underlying [[function (mathematics)|function]].

The forgetful functor ''U'' has both a [[left adjoint]]
:''D'' : '''Set''' → '''Top'''
which equips a given set with the [[discrete topology]], and a [[right adjoint]]
:''I'' : '''Set''' → '''Top'''
which equips a given set with the [[indiscrete topology]]. Both of these functors are, in fact, [[Inverse function#Left and right inverses|right inverses]] to ''U'' (meaning that ''UD'' and ''UI'' are equal to the [[identity functor]] on '''Set'''). Moreover, since any function between discrete or between indiscrete spaces is continuous, both of these functors give [[full embedding]]s of '''Set''' into '''Top'''.

'''Top''' is also ''fiber-complete'' meaning that the [[lattice of topologies|category of all topologies]] on a given set ''X'' (called the ''[[fiber (mathematics)|fiber]]'' of ''U'' above ''X'') forms a [[complete lattice]] when ordered by [[set inclusion|inclusion]]. The [[greatest element]] in this fiber is the discrete topology on ''X'', while the [[least element]] is the indiscrete topology.

'''Top''' is the model of what is called a [[topological category]]. These categories are characterized by the fact that every [[structured source]] <math>(X \to UA_i)_I</math> has a unique [[initial lift]] <math>( A \to A_i)_I</math>. In '''Top''' the initial lift is obtained by placing the [[initial topology]] on the source. Topological categories have many properties in common with '''Top''' (such as fiber-completeness, discrete and indiscrete functors, and unique lifting of limits).

==Limits and colimits==

The category '''Top''' is both [[complete category|complete and cocomplete]], which means that all small [[limit (category theory)|limits and colimit]]s exist in '''Top'''. In fact, the forgetful functor ''U'' : '''Top''' → '''Set''' uniquely lifts both limits and colimits and preserves them as well. Therefore, (co)limits in '''Top''' are given by placing topologies on the corresponding (co)limits in '''Set'''.

Specifically, if ''F'' is a [[diagram (category theory)|diagram]] in '''Top''' and (''L'', ''φ'' : ''L'' → ''F'') is a limit of ''UF'' in '''Set''', the corresponding limit of ''F'' in '''Top''' is obtained by placing the [[initial topology]] on (''L'', ''φ'' : ''L'' → ''F''). Dually, colimits in '''Top''' are obtained by placing the [[final topology]] on the corresponding colimits in '''Set'''.

Unlike many ''algebraic'' categories, the forgetful functor ''U'' : '''Top''' → '''Set''' does not create or reflect limits since there will typically be non-universal [[cone (category theory)|cones]] in '''Top''' covering universal cones in '''Set'''.

Examples of limits and colimits in '''Top''' include:

*The [[empty set]] (considered as a topological space) is the [[initial object]] of '''Top'''; any [[singleton (mathematics)|singleton]] topological space is a [[terminal object]]. There are thus no [[zero object]]s in '''Top'''.
*The [[product (category theory)|product]] in '''Top''' is given by the [[product topology]] on the [[Cartesian product]]. The [[coproduct (category theory)|coproduct]] is given by the [[disjoint union (topology)|disjoint union]] of topological spaces.
*The [[equaliser (mathematics)#In category theory|equalizer]] of a pair of morphisms is given by placing the [[subspace topology]] on the set-theoretic equalizer. Dually, the [[coequalizer]] is given by placing the [[quotient topology]] on the set-theoretic coequalizer.
*[[Direct limit]]s and [[inverse limit]]s are the set-theoretic limits with the [[final topology]] and [[initial topology]] respectively.
*[[Adjunction space]]s are an example of [[pushout (category theory)|pushouts]] in '''Top'''.

==Other properties==
*The [[monomorphism]]s in '''Top''' are the [[injective]] continuous maps, the [[epimorphism]]s are the [[surjective]] continuous maps, and the [[isomorphism]]s are the [[homeomorphism]]s.
*The [[extremal monomorphism|extremal ]] monomorphisms are (up to isomorphism) the [[subspace topology|subspace]] embeddings. In fact, in '''Top''' all extremal monomorphisms happen to satisfy the stronger property of being [[regular monomorphism|regular]].
*The extremal epimorphisms are (essentially) the [[quotient map]]s. Every extremal epimorphism is regular.
*The split monomorphisms are (essentially) the inclusions of [[retract]]s into their ambient space.
*The split epimorphisms are (up to isomorphism) the continuous surjective maps of a space onto one of its retracts.
*There are no [[zero morphism]]s in '''Top''', and in particular the category is not [[preadditive category|preadditive]].
*'''Top''' is not [[cartesian closed category|cartesian closed]] (and therefore also not a [[topos]]) since it does not have [[exponential object]]s for all spaces. When this feature is desired, one often restricts to the full subcategory of [[compactly generated Hausdorff space]]s '''CGHaus'''.

==Relationships to other categories==

*The category of [[pointed topological space]]s '''Top'''• is a [[coslice category]] over '''Top'''.
* The [[homotopy category of topological spaces|homotopy category]] '''hTop''' has topological spaces for objects and [[homotopy equivalent|homotopy equivalence classes]] of continuous maps for morphisms. This is a [[quotient category]] of '''Top'''. One can likewise form the pointed homotopy category '''hTop'''•.
*'''Top''' contains the important category '''Haus''' of [[Hausdorff space|Hausdorff spaces]] as a [[full subcategory]]. The added structure of this subcategory allows for more epimorphisms: in fact, the epimorphisms in this subcategory are precisely those morphisms with [[dense set|dense]] [[image (mathematics)|images]] in their [[codomain]]s, so that epimorphisms need not be [[surjective]].
*'''Top''' contains the full subcategory '''CGHaus''' of [[compactly generated Hausdorff space]]s, which has the important property of being a [[Cartesian closed category]] while still containing all of the typical spaces of interest. This makes '''CGHaus''' a particularly ''convenient category of topological spaces'' that is often used in place of '''Top'''.
* The forgetful functor to '''Set''' has both a left and a right adjoint, as described above in the concrete category section.
* There is a functor to the category of [[Locale (mathematics)|locales]] '''Loc''' sending a topological space to its locale of open sets. This functor has a right adjoint that sends each locale to its topological space of points. This adjunction restricts to an equivalence between the category of [[sober space]]s and spatial locales.

== References ==

* [[Horst Herrlich|Herrlich, Horst]]: ''[https://books.google.com/books?id=Q1J7CwAAQBAJ&printsec=frontcover#v=onepage&q&f=false Topologische Reflexionen und Coreflexionen]''. Springer Lecture Notes in Mathematics 78 (1968).
* Herrlich, Horst: ''Categorical topology 1971–1981''. In: General Topology and its Relations to Modern Analysis and Algebra 5, Heldermann Verlag 1983, pp. 279–383.
* Herrlich, Horst & Strecker, George E.: [https://link.springer.com/chapter/10.1007/978-94-017-0468-7_15 Categorical Topology – its origins, as exemplified by the unfolding of the theory of topological reflections and coreflections before 1971]. In: Handbook of the History of General Topology (eds. C.E.Aull & R. Lowen), Kluwer Acad. Publ. vol 1 (1997) pp. 255–341.
* Adámek, Jiří, Herrlich, Horst, & Strecker, George E.; (1990). [http://katmat.math.uni-bremen.de/acc/acc.pdf ''Abstract and Concrete Categories''] (4.2MB PDF). Originally publ. John Wiley & Sons. {{ISBN|0-471-60922-6}}. (now free on-line edition).

[[Category:Categories in category theory|Topological spaces]]
[[Category:General topology]]

Category of rings

2017-06-08T14:52:05Z

Magmalex: Corrected link to rng of square zero

{{more footnotes|date=October 2013}}
In [[mathematics]], the '''category of rings''', denoted by '''Ring''', is the [[category (mathematics)|category]] whose objects are [[ring (mathematics)|rings]] (with identity) and whose [[morphism]]s are [[ring homomorphism]]s (preserving the identity). Like many categories in mathematics, the category of rings is large, meaning that the [[class (set theory)|class]] of all rings is [[proper class|proper]].

==As a concrete category==

The category '''Ring''' is a [[concrete category]] meaning that the objects are [[set (mathematics)|set]]s with additional structure (addition and multiplication) and the morphisms are [[function (mathematics)|function]]s preserving this structure. There is a natural [[forgetful functor]]
:''U'' : '''Ring''' → '''Set'''
for the category of rings to the [[category of sets]] which sends each ring to its underlying set (thus "forgetting" the operations of addition and multiplication). This functor has a [[left adjoint]]
:''F'' : '''Set''' → '''Ring'''
which assigns to each set ''X'' the [[free ring]] generated by ''X''.

One can also view the category of rings as a concrete category over '''Ab''' (the [[category of abelian groups]]) or over '''Mon''' (the [[category of monoids]]). Specifically, there are [[forgetful functor]]s
:''A'' : '''Ring''' → '''Ab'''
:''M'' : '''Ring''' → '''Mon'''
which "forget" multiplication and addition, respectively. Both of these functors have left adjoints. The left adjoint of ''A'' is the functor which assigns to every [[abelian group]] ''X'' (thought of as a '''Z'''-[[module (mathematics)|module]]) the [[tensor ring]] ''T''(''X''). The left adjoint of ''M'' is the functor which assigns to every [[monoid]] ''X'' the integral [[monoid ring]] '''Z'''[''M''].

==Properties==

===Limits and colimits===

The category '''Ring''' is both [[complete category|complete and cocomplete]], meaning that all small [[limits and colimits]] exist in '''Ring'''. Like many other algebraic categories, the forgetful functor ''U'' : '''Ring''' → '''Set''' [[creation of limits|creates]] (and preserves) limits and [[filtered colimit]]s, but does not preserve either [[coproduct]]s or [[coequalizer]]s. The forgetful functors to '''Ab''' and '''Mon''' also create and preserve limits.

Examples of limits and colimits in '''Ring''' include:

*The ring of [[integer]]s '''Z''' is an [[initial object]] in '''Ring'''.
*The [[zero ring]] is a [[terminal object]] in '''Ring'''.
*The [[product (category theory)|product]] in '''Ring''' is given by the [[direct product of rings]]. This is just the [[cartesian product]] of the underlying sets with addition and multiplication defined component-wise.
*The [[free product of associative algebras|coproduct of a family of rings]] exists and is given by a construction analogous to the [[free product]] of groups. The coproduct of nonzero rings can be the zero ring; in particular, this happens whenever the factors have [[relatively prime]] [[characteristic (algebra)|characteristic]] (since the characteristic of the coproduct of (''R''''i'')''i''∈''I'' must divide the characteristics of each of the rings ''R''''i'').
*The [[Equalizer (mathematics)|equalizer]] in '''Ring''' is just the set-theoretic equalizer (the equalizer of two ring homomorphisms is always a [[subring]]).
*The [[coequalizer]] of two ring homomorphisms ''f'' and ''g'' from ''R'' to ''S'' is the [[quotient ring|quotient]] of ''S'' by the [[ideal (ring theory)|ideal]] generated by all elements of the form ''f''(''r'') − ''g''(''r'') for ''r'' ∈ ''R''.
*Given a ring homomorphism ''f'' : ''R'' → ''S'' the [[kernel pair]] of ''f'' (this is just the [[pullback (category theory)|pullback]] of ''f'' with itself) is a [[congruence relation]] on ''R''. The ideal determined by this congruence relation is precisely the (ring-theoretic) [[kernel (ring theory)|kernel]] of ''f''. Note that [[category-theoretic kernel]]s do not make sense in '''Ring''' since there are no [[zero morphism]]s (see below).
*The ring of [[p-adic integer|''p''-adic integers]] is the [[inverse limit]] in '''Ring''' of a sequence of [[modular arithmetic|rings of integers mod ''p''''n'']].

===Morphisms===
{{main|ring homomorphism}}
Unlike many categories studied in mathematics, there do not always exist morphisms between pairs of objects in '''Ring'''. This is a consequence of the fact that ring homomorphisms must preserve the identity. For example, there are no morphisms from the [[zero ring]] '''0''' to any nonzero ring. A necessary condition for there to be morphisms from ''R'' to ''S'' is that the [[characteristic (algebra)|characteristic]] of ''S'' divide that of ''R''.

Note that even though some of the hom-sets are empty, the category '''Ring''' is still [[connected category|connected]] since it has an initial object.

Some special classes of morphisms in '''Ring''' include:

*[[Isomorphism]]s in '''Ring''' are the [[bijective]] ring homomorphisms.
*[[Monomorphism]]s in '''Ring''' are the [[injective]] homomorphisms. Not every monomorphism is [[regular monomorphism|regular]] however.
*Every surjective homomorphism is an [[epimorphism]] in '''Ring''', but the converse is not true. The inclusion '''Z''' → '''Q''' is a nonsurjective epimorphism. The natural ring homomorphism from any commutative ring ''R'' to any one of its [[localization of a ring|localizations]] is an epimorphism which is not necessarily surjective.
*The surjective homomorphisms can be characterized as the [[regular epimorphism|regular]] or [[extremal epimorphism]]s in '''Ring''' (these two classes coinciding).
*[[Bimorphism]]s in '''Ring''' are the injective epimorphisms. The inclusion '''Z''' → '''Q''' is an example of a bimorphism which is not an isomorphism.

===Other properties===

*The only [[injective object]] in '''Ring''' up to isomorphism is the [[zero ring]] (i.e. the terminal object).
*Lacking [[zero morphism]]s, the category of rings cannot be a [[preadditive category]]. (However, every ring—considered as a small category with a single object— is a preadditive category).
*The category of rings is a [[symmetric monoidal category]] with the [[tensor product of rings]] ⊗'''Z''' as the monoidal product and the ring of integers '''Z''' as the unit object. It follows from the [[Eckmann–Hilton theorem]], that a [[monoid (category theory)|monoid]] in '''Ring''' is just a [[commutative ring]]. The action of a monoid (= commutative ring) ''R'' on an object (= ring) ''A'' of '''Ring''' is just an [[R-algebra|''R''-algebra]].

==Subcategories==

The category of rings has a number of important [[subcategories]]. These include the [[full subcategories]] of [[commutative rings]], [[integral domain]]s, [[principal ideal domain]]s, and [[field (mathematics)|field]]s.

===Category of commutative rings===

The '''category of commutative rings''', denoted '''CRing''', is the full subcategory of '''Ring''' whose objects are all [[commutative rings]]. This category is one of the central objects of study in the subject of [[commutative algebra]].

Any ring can be made commutative by taking the [[quotient ring|quotient]] by the [[ideal (ring theory)|ideal]] generated by all elements of the form (''xy'' − ''yx''). This defines a functor '''Ring''' → '''CRing''' which is left adjoint to the inclusion functor, so that '''CRing''' is a [[reflective subcategory]] of '''Ring'''. The [[free commutative ring]] on a set of generators ''E'' is the [[polynomial ring]] '''Z'''[''E''] whose variables are taken from ''E''. This gives a left adjoint functor to the forgetful functor from '''CRing''' to '''Set'''.

'''CRing''' is limit-closed in '''Ring''', which means that limits in '''CRing''' are the same as they are in '''Ring'''. Colimits, however, are generally different. They can be formed by taking the commutative quotient of colimits in '''Ring'''. The coproduct of two commutative rings is given by the [[tensor product of rings]]. Again, the coproduct of two nonzero commutative rings can be zero.

The [[opposite category]] of '''CRing''' is [[equivalence of categories|equivalent]] to the [[category of affine schemes]]. The equivalence is given by the [[contravariant functor]] Spec which sends a commutative ring to its [[spectrum of a ring|spectrum]], an affine [[scheme (mathematics)|scheme]].

===Category of fields===

The '''category of fields''', denoted '''Field''', is the full subcategory of '''CRing''' whose objects are [[field (mathematics)|field]]s. The category of fields is not nearly as well-behaved as other algebraic categories. In particular, free fields do not exist (i.e. there is no left adjoint to the forgetful functor '''Field''' → '''Set'''). It follows that '''Field''' is ''not'' a reflective subcategory of '''CRing'''.

The category of fields is neither [[finitely complete category|finitely complete]] nor finitely cocomplete. In particular, '''Field''' has neither products nor coproducts.

Another curious aspect of the category of fields is that every morphism is a [[monomorphism]]. This follows from the fact that the only ideals in a field ''F'' are the [[zero ideal]] and ''F'' itself. One can then view morphisms in '''Field''' as [[field extension]]s.

The category of fields is not [[connected category|connected]]. There are no morphisms between fields of different [[characteristic (algebra)|characteristic]]. The connected components of '''Field''' are the full subcategories of characteristic ''p'', where ''p'' = 0 or is a [[prime number]]. Each such subcategory has an [[initial object]]: the [[prime field]] of characteristic ''p'' (which is '''Q''' if ''p'' = 0, otherwise the [[finite field]] '''F'''''p'').

==Related categories and functors==

===Category of groups===

There is a natural functor from '''Ring''' to the [[category of groups]], '''Grp''', which sends each ring ''R'' to its [[group of units]] ''U''(''R'') and each ring homomorphism to the restriction to ''U''(''R''). This functor has a [[left adjoint]] which sends each [[group (mathematics)|group]] ''G'' to the [[integral group ring]] '''Z'''[''G''].

Another functor between these categories sends each ring ''R'' to the group of units of the [[matrix ring]] M2(''R'') which acts on the [[projective line over a ring]] P(''R'').

===''R''-algebras===

Given a commutative ring ''R'' one can define the category '''''R''-Alg''' whose objects are all [[algebra (ring theory)|''R''-algebras]] and whose morphisms are ''R''-algebra homomorphisms.

The category of rings can be considered a special case. Every ring can be considered a '''Z'''-algebra is a unique way. Ring homomorphisms are precisely the '''Z'''-algebra homomorphisms. The category of rings is, therefore, [[isomorphism of categories|isomorphic]] to the category '''Z-Alg'''.<ref>{{citation|title=Sheaf Theory|volume=Volume 20|series=London Mathematical Society Lecture Note Series|first=B. R.|last=Tennison|publisher=Cambridge University Press|year=1975|isbn=9780521207843|page=74|url=https://books.google.com/books?id=oRs7AAAAIAAJ&pg=PA74}}.</ref> Many statements about the category of rings can be generalized to statements about the category of ''R''-algebras.

For each commutative ring ''R'' there is a functor '''''R''-Alg''' → '''Ring''' which forgets the ''R''-module structure. This functor has a left adjoint which sends each ring ''A'' to the [[tensor product of rings|tensor product]] ''R''⊗'''Z'''''A'', thought of as an ''R''-algebra by setting ''r''·(''s''⊗''a'') = ''rs''⊗''a''.

===Rings without identity===

Many authors do not require rings to have a multiplicative identity element and, accordingly, do not require ring homomorphism to preserve the identity (should it exist). This leads to a rather different category. For distinction we call such algebraic structures ''[[rng (algebra)|rngs]]'' and their morphisms ''rng homomorphisms''. The category of all rngs will be denoted by '''Rng'''.

The category of rings, '''Ring''', is a ''nonfull'' [[subcategory]] of '''Rng'''. Nonfull, because there are rng homomorphisms between rings which do not preserve the identity and are, therefore, not morphisms in '''Ring'''. The inclusion functor '''Ring''' → '''Rng''' has a left adjoint which formally adjoins an identity to any rng. This makes '''Ring''' into a (nonfull) [[reflective subcategory]] of '''Rng'''. The inclusion functor '''Ring''' → '''Rng''' respects limits but not colimits.

The [[zero ring]] serves as both an initial and terminal object in '''Rng''' (that is, it is a [[zero object]]). It follows that '''Rng''', like '''Grp''' but unlike '''Ring''', has [[zero morphism]]s. These are just the rng homomorphisms that map everything to 0. Despite the existence of zero morphisms, '''Rng''' is still not a [[preadditive category]]. The pointwise sum of two rng homomorphisms is generally not a rng homomorphism. Coproducts in '''Rng''' are not the same as direct sums.

There is a fully faithful functor from the category of [[abelian group]]s to '''Rng''' sending an abelian group to the associated [[Rng_(algebra)#Rng_of_square_zero|rng of square zero]].

[[Free object|Free construction]]s are less natural in '''Rng''' then they are in '''Ring'''. For example, the free rng generated by a set {''x''} is the ring of all integral polynomials over ''x'' with no constant term, while the free ring generated by {''x''} is just the [[polynomial ring]] '''Z'''[''x''].

==References==
{{Reflist}}

*{{cite book | last = Adámek | first = Jiří |author2=Horst Herrlich|author3=George E. Strecker | year = 1990 | url = http://katmat.math.uni-bremen.de/acc/acc.pdf | title = Abstract and Concrete Categories | publisher = John Wiley & Sons | isbn = 0-471-60922-6}}
*{{cite book | first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane |author2=[[Garrett Birkhoff]] | title = Algebra | edition = (3rd ed.) | publisher = American Mathematical Society | location = Providence, Rhode Island | year = 1999 | isbn = 0-8218-1646-2}}
*{{cite book | first = Saunders | last = Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = Graduate Texts in Mathematics '''5''' | edition = (2nd ed.) | publisher = Springer | isbn = 0-387-98403-8}}

[[Category:Categories in category theory|Rings]]
[[Category:Ring theory]]

Tsiolkovsky rocket equation

2017-04-01T21:34:43Z

Magmalex: /* External links */ better described an external link

{{More footnotes|date=February 2009}}
{{Astrodynamics |Equations}}
[[File:Tsiolkovsky rocket equation.svg|thumb|right|Rocket [[mass ratio]]s versus final velocity calculated from the rocket equation.]]

The '''Tsiolkovsky rocket equation''', or '''ideal rocket equation''', describes the motion of vehicles that follow the basic principle of a [[rocket]]: a device that can apply acceleration to itself (a [[thrust]]) by expelling part of its mass with high velocity and thereby move due to the [[conservation of momentum]]. The equation relates the [[delta-v]] (the maximum change of [[velocity]] of the rocket if no other external forces act) with the [[Specific impulse|effective exhaust velocity]] and the initial and final mass of a [[rocket]] (or other [[reaction engine]]).

For any such maneuver (or journey involving a sequence of such maneuvers):

:<math>\Delta v = v_\text{e} \ln \frac {m_0} {m_f}</math>

where:
:<math>\Delta v\ </math> is [[delta-v]] - the maximum change of [[velocity]] of the vehicle (with no external forces acting).
:<math>m_0</math> is the initial total mass, including [[propellant]].
:<math>m_f</math> is the final total mass without propellant, also known as dry mass.
:<math>v_\text{e}</math> is the effective exhaust velocity.
:<math>\ln</math> refers to the [[natural logarithm]] function.

(The equation can also be written using the [[specific impulse]] instead of the effective exhaust velocity by applying the formula <math>v_\text{e} = I_\text{sp} \cdot g_0</math> where <math>I_\text{sp}</math> is the specific impulse expressed as a time period and <math>g_0</math> is [[standard gravity]] ≈ 9.8 m/s2.)

==History==

The equation is named after [[Russian Empire|Russian]] scientist [[Konstantin Tsiolkovsky]] who independently derived it and published it in his 1903 work.<ref>К. Ціолковскій, Изслѣдованіе мировыхъ пространствъ реактивными приборами, 1903 (available online [http://epizodsspace.airbase.ru/bibl/dorev-knigi/ciolkovskiy/sm.rar here] in a [[RAR (file format)|RARed]] PDF)</ref> The equation had been derived earlier by the [[United Kingdom|British]] mathematician [[William Moore (British mathematician)|William Moore]] in 1813.<ref name=moore>{{Cite book |authorlink=William Moore (British mathematician) |last=Moore |first=William |author2= of the [[Military Academy at Woolwich]] |title=A Treatise on the Motion of Rockets. To which is added, An Essay on Naval Gunnery |location=London |date=1813 |publisher=G. and S. Robinson }}</ref> The minister [[William Leitch (scientist)|William Leitch]] who was a capable scientist also independently derived the fundamentals of rocketry in 1861.

This equation was independently derived by [[Konstantin Tsiolkovsky]] towards the end of the 19th century and is sometimes known under his name, but more often simply referred to as 'the rocket equation' (or sometimes the 'ideal rocket equation').

While the derivation of the rocket equation is a straightforward calculus exercise, Tsiolkovsky is honored as being the first to apply it to the question of whether rockets could achieve speeds necessary for space travel.

==Derivation==
Consider the following system:
[[Image:Var mass system.PNG]]

In the following derivation, "the rocket" is taken to mean "the rocket and all of its unburned propellant".

Newton's second law of motion relates external forces (<math>F_i\,</math>) to the change in linear momentum of the whole system (including rocket and exhaust) as follows:

:<math>\sum F_i = \lim_{\Delta t \to 0} \frac{P_2-P_1}{\Delta t}</math>

where <math>P_1\,</math> is the momentum of the rocket at time ''t=0'':

:<math> P_1 = \left( {m + \Delta m} \right)V</math>

and <math>P_2\,</math> is the momentum of the rocket and exhausted mass at time <math>t=\Delta t\,</math>:

:<math>P_2 = m\left(V + \Delta V \right) + \Delta m V_e</math>

and where, with respect to the observer:

:{|
| <math>V\,</math> is the velocity of the rocket at time ''t=0''
|-
| <math>V+\Delta V\,</math> is the velocity of the rocket at time <math>t=\Delta t\,</math>
|-
| <math>V_e\,</math> is the velocity of the mass added to the exhaust (and lost by the rocket) during time <math>\Delta t\,</math>
|-
| <math>m+\Delta m\,</math> is the mass of the rocket at time ''t=0''
|-
| <math>m\,</math> is the mass of the rocket at time <math>t=\Delta t\,</math>
|}

The velocity of the exhaust <math>V_e</math> in the observer frame is related to the velocity of the exhaust in the rocket frame <math>v_e</math> by (since exhaust velocity is in the negative direction)

:<math>V_e=V-v_e</math>

Solving yields:

:<math>P_2-P_1=m\Delta V-v_e\Delta m\,</math>

and, using <math>dm=-\Delta m</math>, since ejecting a positive <math>\Delta m</math> results in a decrease in mass,

:<math>\sum F_i=m\frac{dV}{dt}+v_e\frac{dm}{dt}</math>

If there are no external forces then <math>\sum F_i=0</math> ([[Momentum#Conservation|conservation of linear momentum]]) and

:<math>m\frac{dV}{dt}=-v_e\frac{dm}{dt}</math>

Assuming <math>v_e\,</math> is constant, this may be integrated to yield:

:<math>\Delta V\ = v_e \ln \frac {m_0} {m_1}</math>

or equivalently

:<math>m_1=m_0 e^{-\Delta V\ / v_e}</math>      or      <math>m_0=m_1 e^{\Delta V\ / v_e}</math>      or      <math>m_0 - m_1=m_1 (e^{\Delta V\ / v_e} - 1)</math>

where <math>m_0</math> is the initial total mass including propellant, <math>m_1</math> the final total mass, and <math>v_e</math> the velocity of the rocket exhaust with respect to the rocket (the [[specific impulse]], or, if measured in time, that multiplied by [[gravity]]-on-Earth acceleration).

The value <math>m_0 - m_1</math> is the total mass of propellant expended, and hence:

:<math>M_f = 1-\frac {m_1} {m_0}=1-e^{-\Delta V\ / v_\text{e}}</math>

where <math>M_f</math> is the [[propellant mass fraction]] (the part of the initial total mass that is spent as [[working mass]]).

<math>\Delta V\ </math> ([[delta v]]) is the integration over time of the magnitude of the acceleration produced by using the rocket engine (what would be the actual acceleration if external forces were absent). In free space, for the case of acceleration in the direction of the velocity, this is the increase of the speed. In the case of an acceleration in opposite direction (deceleration) it is the decrease of the speed. Of course gravity and drag also accelerate the vehicle, and they can add or subtract to the change in velocity experienced by the vehicle. Hence delta-v is not usually the actual change in speed or velocity of the vehicle.

If [[special relativity]] is taken into account, the following equation can be derived for a [[relativistic rocket]],<ref>Forward, Robert L. [http://www.relativitycalculator.com/images/rocket_equations/AIAA.pdf "A Transparent Derivation of the Relativistic Rocket Equation"] (see the right side of equation 15 on the last page, with R as the ratio of initial to final mass and w as the exhaust velocity, corresponding to ve in the notation of this article)</ref> with <math>\Delta v</math> again standing for the rocket's final velocity (after burning off all its fuel and being reduced to a rest mass of <math>m_1</math>) in the [[inertial frame of reference]] where the rocket started at rest (with the rest mass including fuel being <math>m_0</math> initially), and <math>c</math> standing for the [[speed of light]] in a vacuum:

:<math>\frac{m_0}{m_1} = \left[\frac{1 + {\frac{\Delta v}{c}}}{1 - {\frac{\Delta v}{c}}}\right]^{\frac{c}{2v_e}}</math>

Writing <math>\frac{m_0}{m_1}</math> as <math>R</math>, a little algebra allows this equation to be rearranged as

:<math>\frac{\Delta v}{c} = \frac{R^{\frac{2v_e}{c}} - 1}{R^{\frac{2v_e}{c}} + 1}</math>

Then, using the [[Identity (mathematics)|identity]] <math>R^{\frac{2v_e}{c}} = \exp \left[ \frac{2v_e}{c} \ln R \right]</math> (here "exp" denotes the [[exponential function]]; ''see also'' [[Natural logarithm]] as well as the "power" identity at [[Logarithm#Product, quotient, power and root|Logarithmic identities]]) and the identity <math>\tanh x = \frac{e^{2x} - 1} {e^{2x} + 1}</math> (''see'' [[Hyperbolic function]]), this is equivalent to

:<math>\Delta v = c \cdot \tanh \left(\frac {v_e}{c} \ln \frac{m_0}{m_1} \right)</math>

===Other derivations===

====Impulse-based====
The equation can also be derived from the basic integral of acceleration in the form of force (thrust) over mass.
By representing the delta-v equation as the following:
:<math>\Delta v = \int^{t1}_{t0} \frac{|T|}{{m_0}-\Delta{m} {t}} ~ dt</math>

where T is thrust, <math>m_0</math> is the initial (wet) mass and <math>\Delta m</math> is the initial mass minus the final (dry) mass,

and realising that the integral of a resultant force over time is total impulse, assuming thrust is the only force involved,
:<math>\int^{t1}_{t0} F ~ dt = J</math>

The integral is found to be:
:<math>J ~ \frac{\ln({m_0}) - \ln({m_1})}{\Delta m}</math>

Realising that impulse over the change in mass is equivalent to force over propellant mass flow rate (p), which is itself equivalent to exhaust velocity,
:<math> \frac{J}{\Delta m} = \frac{F}{p} = V_{exh}</math>

the integral can be equated to
:<math>\Delta v = V_{exh} ~ \ln\left({\frac{m_0}{m_1}}\right)</math>

====Acceleration-based====

Imagine a rocket at rest in space with no forces exerted on it ([[Newton's laws of motion|Newton's First Law of Motion]]). However, as soon as its engine is started (clock set to 0) the rocket is expelling gas mass at a ''constant mass flow rate M'' (kg/s) and at ''exhaust velocity relative to the rocket ve'' (m/s). This creates a constant force propelling the rocket that is equal to ''M × ve''. The mass of fuel the rocket initially has on board is equal to ''m0 - mf''. It will therefore take a time that is equal to ''(m0 - mf)/M'' to burn all this fuel. Now, the rocket is subject to a constant force (M × ve), but at the same time its total weight is decreasing steadily because it's expelling gas. According to [[Newton's laws of motion|Newton's Second Law of Motion]], this can have only one consequence; its acceleration is increasing steadily. To obtain the acceleration, the propelling force has to be divided by the rocket's total mass. So, the level of acceleration at any moment (''t'') after ignition and until the fuel runs out is given by;

:<math> ~ \frac{M v_e}{m_0 - (M t)}.</math>

Since the time it takes to burn the fuel is ''(m0 - mf)/M'' the acceleration reaches its maximum of
:<math> ~ \frac{M v_e}{m_f}</math>
the moment the last fuel is expelled. Since the exhaust velocity is related to the [[specific impulse]] in unit time as
:<math>I_{\rm sp}=\frac{v_\text{e}}{g_0},</math><ref name="SINasa">{{cite web|url=http://www.grc.nasa.gov/WWW/K-12/airplane/specimp.html|title=Specific impulse|last=Benson|first=Tom|date= 2008-07-11|publisher=NASA|accessdate= 2009-12-22}}</ref>
where ''g0'' is the [[standard gravity]], the corresponding maximum [[g-force]] is
:<math> ~ \frac{M I_{\rm sp}}{m_f}.</math>
Since speed is the definite [[Integral|integration]] of acceleration, and the integration has to start at ignition and end the moment the last propellant leaves the rocket, the following definite integral yields the speed at the moment the fuel runs out;

:<math> ~ \int^{\frac{m_0-m_f}{M}}_{0} \frac{M v_e}{m_0 - (M t)} ~ dt = ~ - v_e \ln(m_f) + v_e \ln(m_0) = ~ v_e \ln\left(\frac{m_0}{m_f}\right) </math>

==Terms of the equation==

===Delta-''v''===
{{main article|Delta-v}}
Delta-''v'' (literally "[[Delta (letter)#Upper case|change]] in [[velocity]]"), symbolised as '''Δ''v''''' and pronounced ''delta-vee'', as used in [[flight dynamics (spacecraft)|spacecraft flight dynamics]], is a measure of the [[impulse (physics)|impulse]] that is needed to perform a maneuver such as launching from, or landing on a planet or moon, or an in-space [[orbital maneuver]]. It is a [[scalar (mathematics)|scalar]] that has the units of [[speed]]. As used in this context, it is ''not'' the same as the [[delta-v (physics)|physical change in velocity]] of the vehicle.

Delta-''v'' is produced by reaction engines, such as [[rocket engines]] and is proportional to the [[thrust]] per unit mass, and burn time, and is used to determine the mass of [[rocket propellant|propellant]] required for the given manoeuvre through the rocket equation.

For multiple manoeuvres, delta-''v'' sums linearly.

For interplanetary missions delta-''v'' is often plotted on a [[porkchop plot]] which displays the required mission delta-''v'' as a function of launch date.

===Mass fraction===
{{main article|Propellant mass fraction}}
In [[aerospace engineering]], the propellant mass fraction is the portion of a vehicle's mass which does not reach the destination, usually used as a measure of the vehicle's performance. In other words, the propellant mass fraction is the ratio between the propellant mass and the initial mass of the vehicle. In a spacecraft, the destination is usually an orbit, while for aircraft it is their landing location. A higher mass fraction represents less weight in a design. Another related measure is the [[payload fraction]], which is the fraction of initial weight that is payload.

===Effective exhaust velocity===
{{main article|effective exhaust velocity}}
The effective exhaust velocity is often specified as a [[specific impulse]] and they are related to each other by:

:<math>v_\text{e} = g_0 I_\text{sp},</math>

where

:<math>I_\text{sp}</math> is the specific impulse in seconds,

:<math>v_\text{e}</math> is the specific impulse measured in [[metre per second|m/s]], which is the same as the effective exhaust velocity measured in m/s (or ft/s if g is in ft/s2),

:<math>g_0</math> is the acceleration due to gravity at the Earth's surface, 9.81 m/s2 (in [[Imperial units]] 32.2 ft/s2).

==Applicability==
The rocket equation captures the essentials of rocket flight physics in a single short equation. It also holds true for rocket-like reaction vehicles whenever the effective exhaust velocity is constant, and can be summed or integrated when the effective exhaust velocity varies. The rocket equation only accounts for the reaction force from the rocket engine; it does not include other forces that may act on a rocket, such as [[aerodynamic force|aerodynamic]] or [[gravitation]]al forces. As such, when using it to calculate the propellant requirement for launch from (or powered descent to) a planet with an atmosphere, the effects of these forces must be included in the delta-V requirement (see Examples below). In what has been called "the tyranny of the rocket equation", there is a limit to the amount of [[Payload fraction|payload]] that the airship can carry, as higher amounts of propellant increment the overall weight, and thus also increase the fuel consumption.<ref>{{Cite web|url=http://www.nasa.gov/mission_pages/station/expeditions/expedition30/tryanny.html|title=NASA - The Tyranny of the Rocket Equation|website=www.nasa.gov|language=en|access-date=2016-04-18}}</ref> The equation does not apply to [[Non-rocket spacelaunch|non-rocket systems]] such as [[aerobraking]], [[Space gun|gun launch]]es, [[space elevator]]s, [[launch loop]]s, or [[tether propulsion]].

The rocket equation can be applied to [[orbital maneuver]]s in order to determine how much propellant is needed to change to a particular new orbit, or to find the new orbit as the result of a particular propellant burn. When applying to orbital maneuvers, one assumes an [[Orbital maneuver|impulsive maneuver]], in which the propellant is discharged and delta-v applied instantaneously. This assumption is relatively accurate for short-duration burns such as for mid-course corrections and orbital insertion maneuvers. As the burn duration increases, the result is less accurate due to the effect of gravity on the vehicle over the duration of the maneuver. For low-thrust, long duration propulsion, such as [[Electrically powered spacecraft propulsion|electric propulsion]], more complicated analysis based on the propagation of the spacecraft's state vector and the integration of thrust are used to predict orbital motion.

==Examples==
Assume an exhaust velocity of {{convert|4500|m/s|ft/s|sp=us}} and a <math>\Delta v</math> of {{convert|9700|m/s|ft/s|sp=us}} (Earth to [[low Earth orbit|LEO]], including <math>\Delta v</math> to overcome gravity and aerodynamic drag).

*[[Single-stage-to-orbit]] rocket: <math>1-e^{-9.7/4.5}</math> = 0.884, therefore 88.4% of the initial total mass has to be propellant. The remaining 11.6% is for the engines, the tank, and the payload.
*[[Two-stage-to-orbit]]: suppose that the first stage should provide a <math>\Delta v</math> of {{convert|5000|m/s|ft/s|sp=us}}; <math>1-e^{-5.0/4.5}</math> = 0.671, therefore 67.1% of the initial total mass has to be propellant to the first stage. The remaining mass is 32.9%. After disposing of the first stage, a mass remains equal to this 32.9%, minus the mass of the tank and engines of the first stage. Assume that this is 8% of the initial total mass, then 24.9% remains. The second stage should provide a <math>\Delta v</math> of {{convert|4700|m/s|ft/s|sp=us}}; <math>1-e^{-4.7/4.5}</math> = 0.648, therefore 64.8% of the remaining mass has to be propellant, which is 16.2% of the original total mass, and 8.7% remains for the tank and engines of the second stage, the payload, and in the case of a space shuttle, also the orbiter. Thus together 16.7% of the original launch mass is available for ''all'' engines, the tanks, and payload.

==Stages==
In the case of sequentially thrusting [[Staging (rocketry)|rocket stages]], the equation applies for each stage, where for each stage the initial mass in the equation is the total mass of the rocket after discarding the previous stage, and the final mass in the equation is the total mass of the rocket just before discarding the stage concerned. For each stage the specific impulse may be different.

For example, if 80% of the mass of a rocket is the fuel of the first stage, and 10% is the dry mass of the first stage, and 10% is the remaining rocket, then

:<math>
\begin{align}
\Delta v \ & = v_\text{e} \ln { 100 \over 100 - 80 }\\
& = v_\text{e} \ln 5 \\
& = 1.61 v_\text{e}. \\
\end{align}
</math>

With three similar, subsequently smaller stages with the same <math>v_e</math> for each stage, we have

:<math>\Delta v \ = 3 v_\text{e} \ln 5 \ = 4.83 v_\text{e} </math>

and the payload is 10%*10%*10% = 0.1% of the initial mass.

A comparable [[Single-stage-to-orbit|SSTO]] rocket, also with a 0.1% payload, could have a mass of 11.1% for fuel tanks and engines, and 88.8% for fuel. This would give

:<math>\Delta v \ = v_\text{e} \ln(100/11.2) \ = 2.19 v_\text{e}. </math>

If the motor of a new stage is ignited before the previous stage has been discarded and the simultaneously working motors have a different specific impulse (as is often the case with solid rocket boosters and a liquid-fuel stage), the situation is more complicated.

==Common misconceptions==

When viewed as a [[variable-mass system]], a rocket cannot be directly analyzed with [[Newton's second law of motion]] because the law is valid for constant-mass systems only.<ref name="plastino">{{cite journal|last=Plastino|first=Angel R. |author2=Muzzio, Juan C. |date=1992|title=On the use and abuse of Newton's second law for variable mass problems|journal=Celestial Mechanics and Dynamical Astronomy|publisher=Kluwer Academic Publishers|location=Netherlands|volume= 53|issue= 3|pages=227–232|issn=0923-2958|bibcode=1992CeMDA..53..227P|doi=10.1007/BF00052611}} "We may conclude emphasizing that Newton's second law is valid for constant mass only. When the mass varies due to accretion or ablation, [an alternate equation explicitly accounting for the changing mass] should be used."</ref><ref name=Halliday>{{cite book|last1=Halliday|last2=Resnick |title=Physics|volume=1|pages=199|quote=It is important to note that we ''cannot'' derive a general expression for Newton's second law for variable mass systems by treating the mass in '''F''' = ''d'''''P'''/''dt'' = ''d''(''M'''''v''') as a ''variable''. [...] We ''can'' use '''F''' = ''d'''''P'''/''dt'' to analyze variable mass systems ''only'' if we apply it to an ''entire system of constant mass'' having parts among which there is an interchange of mass.|isbn=0-471-03710-9}} [Emphasis as in the original]</ref><ref name=Kleppner>
{{cite book|last=Kleppner|first=Daniel|author2=Robert Kolenkow |title=An Introduction to Mechanics|publisher=McGraw-Hill|date=1973|pages=133–134|isbn=0-07-035048-5|quote=Recall that '''F''' = ''d'''''P'''/''dt'' was established for a system composed of a certain set of particles[. ... I]t is essential to deal with the same set of particles throughout the time interval[. ...] Consequently, the mass of the system can not change during the time of interest.}}</ref> It can cause confusion that the Tsiolkovsky rocket equation looks similar to the relativistic force equation <math>F = dp/dt = m \; dv/dt + v \; dm/dt</math>. Using this formula with <math>m(t)</math> as the varying mass of the rocket seems to derive Tsiolkovsky rocket equation, but this derivation is not correct. Notice that the [[effective exhaust velocity]] <math>v_e</math> doesn't even appear in this formula.

==See also==
{{Portal| Spaceflight }}
* [[Delta-v budget]]
* [[Oberth effect]] applying delta-v in a [[gravity well]] increases the final velocity
* [[Spacecraft propulsion]]
* [[Mass ratio]]
* [[Working mass]]
* [[Relativistic rocket]]
* [[Reversibility of orbits]]
* [[Variable-mass system]]s

{{Orbits}}

==References==

{{Reflist}}

==External links==
*[http://ed-thelen.org/rocket-eq.html How to derive the rocket equation]
*[http://www.relativitycalculator.com/rocket_equations.shtml Relativity Calculator - Learn Tsiolkovsky's rocket equations]
*[http://www.wolframalpha.com/input/?i=Tsiolkovsky+rocket+equation Tsiolkovsky's rocket equations plot and calculator in WolframAlpha]

{{DEFAULTSORT:Tsiolkovsky Rocket Equation}}
[[Category:Astrodynamics]]
[[Category:Equations]]
[[Category:Single-stage-to-orbit]]
[[Category:Rocket propulsion]]

Unification (computer science)

2017-02-04T15:30:29Z

Magmalex: /* Higher-order unification */ corrected ellipsis formula and converted to html

In [[logic]] and [[computer science]], '''unification''' is an algorithmic process of [[equation solving|solving]] [[equations]] between symbolic [[expression (mathematics)|expressions]].

Depending on which expressions (also called ''terms'') are allowed to occur in an equation set (also called ''unification problem''), and which expressions are considered equal, several frameworks of unification are distinguished. If higher-order variables, that is, variables representing [[function (mathematics)|function]]s, are allowed in an expression, the process is called '''higher-order unification''', otherwise '''first-order unification'''. If a solution is required to make both sides of each equation literally equal, the process is called '''syntactic''' or '''free unification''', otherwise '''semantic''' or '''equational unification''', or '''E-unification''', or '''unification modulo theory'''.

A ''solution'' of a unification problem is denoted as a [[substitution (logic)|substitution]], that is, a mapping assigning a symbolic value to each variable of the problem's expressions. A unification algorithm should compute for a given problem a ''complete'', and ''minimal'' substitution set, that is, a set covering all its solutions, and containing no redundant members. Depending on the framework, a complete and minimal substitution set may have at most one, at most finitely many, or possibly infinitely many members, or may not exist at all.<ref group=note>in this case, still a complete substitution set exists (e.g. the set of all solutions at all); however, each such set contains redundant members.</ref><ref>{{cite journal|first1=François|last1=Fages|first2=Gérard|last2=Huet|title=Complete Sets of Unifiers and Matchers in Equational Theories|journal=Theoretical Computer Science|volume=43|pages=189–200|year=1986|doi=10.1016/0304-3975(86)90175-1}}</ref> In some frameworks it is generally impossible to decide whether any solution exists. For first-order syntactical unification, Martelli and Montanari<ref name="Martelli.Montanari.1982">{{cite journal|first1=Alberto|last1=Martelli|first2=Ugo|last2=Montanari|title=An Efficient Unification Algorithm|journal=ACM Trans. Program. Lang. Syst.|volume=4|number=2|pages=258–282|date=Apr 1982|doi=10.1145/357162.357169}}</ref> gave an algorithm that reports unsolvability or computes a complete and minimal singleton substitution set containing the so-called '''most general unifier'''.

For example, using ''x'',''y'',''z'' as variables, the singleton equation set { ''[[cons]]''(''x'',''cons''(''x'',''[[Lisp (programming language)#Lists|nil]]'')) = ''cons''(2,''y'') } is a syntactic first-order unification problem that has the substitution { ''x'' ↦ 2, ''y'' ↦ ''cons''(2,''nil'') } as its only solution.
The syntactic first-order unification problem { ''y'' = ''cons''(2,''y'') } has no solution over the set of [[term (logic)|finite terms]]; however, it has the single solution { ''y'' ↦ ''cons''(2,''cons''(2,''cons''(2,...))) } over the set of [[Tree (set theory)|infinite trees]].
The semantic first-order unification problem { ''a''⋅''x'' = ''x''⋅''a'' } has each substitution of the form { ''x'' ↦ ''a''⋅...⋅''a'' } as a solution in a [[semigroup]], i.e. if (⋅) is considered [[associative]]; the same problem, viewed in an [[abelian group]], where (⋅) is considered also [[commutative]], has any substitution at all as a solution.
The singleton set { ''a'' = ''y''(''x'') } is a syntactic second-order unification problem, since ''y'' is a function variable.
One solution is { ''x'' ↦ ''a'', ''y'' ↦ ([[identity function]]) }; another one is { ''y'' ↦ ([[constant function]] mapping each value to ''a''), ''x'' ↦ ''(any value)'' }.

The first formal investigation of unification can be attributed to [[J. Alan Robinson|John Alan Robinson]],<ref name="Robinson.1965">{{cite journal | author=J.A. Robinson | title=A Machine-Oriented Logic Based on the Resolution Principle | journal=Journal of the ACM | volume=12 | number=1 | pages=23–41 |date=Jan 1965 | doi=10.1145/321250.321253}}; Here: sect.5.8, p.32</ref><ref>{{cite journal | author=J.A. Robinson | title=Computational logic: The unification computation | journal=Machine Intelligence | volume=6 | pages=63–72 | url=http://aitopics.org/sites/default/files/classic/Machine%20Intelligence%206/MI6-Ch4-Robinson.pdf | year=1971 }}</ref> who used first-order syntactical unification as a basic building block of his [[Resolution (logic)|resolution]] procedure for first-order logic, a great step forward in [[automated reasoning]] technology, as it eliminated one source of combinatorial explosion: searching for instantiation of terms. Today, automated reasoning is still the main application area of unification.
Syntactical first-order unification is used in [[logic programming]] and programming language [[type system]] implementation, especially in [[Hindley–Milner]] based [[type inference]] algorithms.
Semantic unification is used in [[SMT solver]]s, [[term rewriting]] algorithms and [[cryptographic protocol]] analysis.
Higher-order unification is used in proof assistants, for example [[Isabelle (theorem prover)|Isabelle]] and [[Twelf]], and restricted forms of higher-order unification ('''higher-order pattern unification''') are used in some programming language implementations, such as [[lambdaProlog]], as higher-order patterns are expressive, yet their associated unification procedure retains theoretical properties closer to first-order unification.

==Common formal definitions==

===Prerequisites===

Formally, a unification approach presupposes
* An infinite set ''V'' of '''variables'''. For higher-order unification, it is convenient to choose ''V'' disjoint from the set of [[Lambda term#Lambda terms|lambda-term bound variables]].
* A set ''T'' of '''terms''' such that ''V'' ⊆ ''T''. For first-order unification and higher-order unification, ''T'' is usually the set of [[Term (first-order logic)#Terms|first-order terms]] (terms built from variable and function symbols) and [[Lambda term#Lambda terms|lambda terms]] (terms containing some higher-order variables), respectively.
* A mapping ''vars'': ''T'' → [[power set|ℙ]](''V''), assigning to each term ''t'' the set ''vars''(''t'') ⊊ ''V'' of '''free variables''' occurring in ''t''.
* An '''[[equivalence relation]]''' ≡ on ''T'', indicating which terms are considered equal. For higher-order unification, usually ''t'' ≡ ''u'' if ''t'' and ''u'' are [[Lambda term#Alpha equivalence|alpha equivalent]]. For first-order E-unification, ≡ reflects the background knowledge about certain function symbols; for example, if ⊕ is considered commutative, ''t'' ≡ ''u'' if ''u'' results from ''t'' by swapping the arguments of ⊕ at some (possibly all) occurrences. <ref group=note>E.g. ''a'' ⊕ (''b'' ⊕ ''f''(''x'')) ≡ ''a'' ⊕ (''f''(''x'') ⊕ ''b'') ≡ (''b'' ⊕ ''f''(''x'')) ⊕ ''a'' ≡ (''f''(''x'') ⊕ ''b'') ⊕ ''a''</ref> If there is no background knowledge at all, then only literally, or syntactically, identical terms are considered equal; in this case, ≡ is called the '''[[free theory]]''' (because it is a [[free object]]), the '''[[empty theory]]''' (because the set of equational [[sentence (mathematical logic)|sentences]], or the background knowledge, is empty), the '''theory of [[uninterpreted function]]s''' (because unification is done on uninterpreted [[term (logic)|terms]]), or the '''theory of [[Algebraic specification|constructors]]''' (because all function symbols just build up data terms, rather than operating on them).

===First-order term===
{{main|Term (logic)}}
Given a set ''V'' of variable symbols, a set ''C'' of constant symbols and sets ''F''''n'' of ''n''-ary function symbols, also called operator symbols, for each natural number ''n'' ≥ 1, the set of (unsorted first-order) terms ''T'' is [[recursive definition|recursively defined]] to be the smallest set with the following properties:<ref>{{cite book| author1=C.C. Chang |author1link= Chen Chung Chang|author2= H. Jerome Keisler| author2link=Howard Jerome Keisler| title=Model Theory| year=1977| volume=73| publisher=North Holland| editor=A. Heyting and H.J. Keisler and A. Mostowski and A. Robinson and P. Suppes| series=Studies in Logic and the Foundation of Mathematics}}; here: Sect.1.3</ref>
* every variable symbol is a term: ''V'' ⊆ ''T'',
* every constant symbol is a term: ''C'' ⊆ ''T'',
* from every ''n'' terms ''t''1,...,''t''''n'', and every ''n''-ary function symbol ''f'' ∈ ''F''''n'', a larger term ''f''(''t''1,...,''t''''n'') can be built.
For example, if ''x'' ∈ ''V'' is a variable symbol, 1 ∈ ''C'' is a constant symbol, and ''add'' ∈ ''F''2 is a binary function symbol, then ''x'' ∈ ''T'', 1 ∈ ''T'', and (hence) ''add''(''x'',1) ∈ ''T'' by the first, second, and third term building rule, respectively. The latter term is usually written as ''x''+1, using [[infix notation]] and the more common operator symbol + for convenience.


===Higher-order term===
{{main|Lambda calculus}}

===Substitution===
{{main|Substitution (logic)}}
A '''substitution''' is a mapping σ: ''V'' → ''T'' from variables to terms; the notation {{math|{{mset| ''x''1 ↦ ''t''1, ..., ''x''''k'' ↦ ''t''''k'' }}}} refers to a substitution mapping each variable ''x''''i'' to the term ''t''''i'', for ''i''=1,...,''k'', and every other variable to itself. '''Applying''' that substitution to a term ''t'' is written in [[postfix notation]] as {{math|''t'' {{mset|''x''1 ↦ ''t''1, ..., ''x''''k'' ↦ ''t''''k''}}}}; it means to (simultaneously) replace every occurrence of each variable ''x''''i'' in the term ''t'' by ''t''''i''. The result ''t''σ of applying a substitution σ to a term ''t'' is called an '''instance''' of that term ''t''.
As a first-order example, applying the substitution {{math|{{mset| ''x'' ↦ ''h''(''a'',''y''), ''z'' ↦ ''b'' }}}} to the term
{|
|-
|
| ''f''(
| align="center" | '''''x'''''
|, ''a'', ''g''(
| '''''z'''''
| ), ''y'')
|-
| yields  
|-
|
| ''f''(
| '''''h'''''('''''a''''','''''y''''')
|, ''a'', ''g''(
| '''''b'''''
| ), ''y'').
|}

===Generalization, specialization===

If a term ''t'' has an instance equivalent to a term ''u'', that is, if {{math|1=''tσ'' ≡ ''u''}} for some substitution σ, then ''t'' is called '''more general''' than ''u'', and ''u'' is called '''more special''' than, or '''subsumed''' by, ''t''. For example,{{math|1= ''x'' ⊕ ''a''}} is more general than {{math|1=''a'' ⊕ ''b''}} if ⊕ is [[Commutative property|commutative]], since then {{math|1=(''x'' ⊕ ''a'') {{mset|''x''↦''b''}} = ''b'' ⊕ ''a'' ≡ ''a'' ⊕ ''b''}}.

If ≡ is literal (syntactic) identity of terms, a term may be both more general and more special than another one only if both terms differ just in their variable names, not in their syntactic structure; such terms are called '''variants''', or '''renamings''' of each other.
For example,
{{math|''f''(''x''1,''a'',''g''(''z''1),''y''1)}}
is a variant of
{{math|''f''(''x''2,''a'',''g''(''z''2),''y''2)}},
since

{{math|''f''(''x''1,''a'',''g''(''z''1),''y''1)
{{mset|''x''1 ↦ ''x''2, ''y''1 ↦ ''y''2, ''z''1 ↦ ''z''2}} {{=}}
''f''(''x''2,''a'',''g''(''z''2),''y''2)}}
and
{{math|''f''(''x''2,''a'',''g''(''z''2),''y''2)
{{mset|''x''2 ↦ ''x''1, ''y''2 ↦ ''y''1, ''z''2 ↦ ''z''1}} {{=}}
''f''(''x''1,''a'',''g''(''z''1),''y''1)}}.
However,
{{math|''f''(''x''1,''a'',''g''(''z''1),''y''1)}}
is ''not'' a variant of
{{math|''f''(''x''2,''a'',''g''(''x''2),''x''2)}},
since no substitution can transform the latter term into the former one.
The latter term is therefore properly more special than the former one.

For arbitrary ≡, a term may be both more general and more special than a structurally different term.
For example, if ⊕ is [[idempotent]], that is, if always {{math|1=''x'' ⊕ ''x'' ≡ ''x''}}, then the term {{math|1=''x'' ⊕ ''y''}} is more general than {{math|1=(''x'' ⊕ ''y'') {{mset|''x'' ↦ ''z'', ''y'' ↦ ''z''}} = ''z'' ⊕ ''z'' ≡ ''z''}}, and vice versa ''z'' is more general than {{math|1=''z'' {{mset|''z'' ↦ ''x'' ⊕ ''y''}} = ''x'' ⊕ ''y''}}, although {{math|1=''x''⊕''y''}} and ''z'' are of different structure.

A substitution {{mvar|σ}} is '''more special''' than, or '''subsumed''' by, a substitution {{mvar|τ}} if {{mvar|tσ}} is more special than {{mvar|tτ}} for each term {{mvar|t}}. We also say that {{mvar|τ}} is more general than {{mvar|σ}}.
For instance {{math|1={{mset| ''x'' ↦ ''a'', ''y'' ↦ ''a'' }}}} is more special than {{math|1=τ = {{mset| ''x'' ↦ ''y'' }}}},
but
{{math|1=''σ'' = {{mset| ''x'' ↦ ''a'' }}}} is not,
as {{math|1=''f''(''x'',''y'')''σ'' = ''f''(''a'',''y'')}} is not more special than
{{math|1=''f''(''x'',''y'')''τ'' = ''f''(''y'',''y'')}}.<ref>K.R. Apt. "From Logic Programming to Prolog", p. 24. Prentice Hall, 1997.</ref>

===Unification problem, solution set===

A '''unification problem''' is a finite set {{math|{{mset| ''l''1 ≐ ''r''1, ..., ''l''''n'' ≐ ''r''''n'' }}}} of potential equations, where {{math|''l''''i'', ''r''''i'' ∈ ''T''}}.
A substitution σ is a '''solution''' of that problem if {{math|''l''''i''σ ≡ ''r''''i''σ}} for {{math|1=''i''=1,...,''n''}}. Such a substitution is also called a '''unifier''' of the unification problem.
For example, if ⊕ is [[Associative property|associative]], the unification problem { ''x'' ⊕ ''a'' ≐ ''a'' ⊕ ''x'' } has the solutions {''x'' ↦ ''a''}, {''x'' ↦ ''a'' ⊕ ''a''}, {''x'' ↦ ''a'' ⊕ ''a'' ⊕ ''a''}, etc., while the problem { ''x'' ⊕ ''a'' ≐ ''a'' } has no solution.

For a given unification problem, a set ''S'' of unifiers is called '''complete''' if each solution substitution is subsumed by some substitution σ ∈ ''S''; the set ''S'' is called '''minimal''' if none of its members subsumes another one.

==Syntactic unification of first-order terms==

[[File:Triangle diagram of syntactic unification svg.svg|thumb|Schematic triangle diagram of syntactically unifying terms ''t''1 and ''t''2 by a substitution σ]]
''Syntactic unification of first-order terms'' is the most widely used unification framework.
It is based on ''T'' being the set of ''first-order terms'' (over some given set ''V'' of variables, ''C'' of constants and ''F''''n'' of ''n''-ary function symbols) and on ≡ being ''syntactic equality''.
In this framework, each solvable unification problem {{math|{{mset|''l''1 ≐ ''r''1, ..., ''l''''n'' ≐ ''r''''n''}}}} has a complete, and obviously minimal, [[Singleton (mathematics)|singleton]] solution set {{math|1={{mset|''σ''}}}}.
Its member {{mvar|σ}} is called the '''most general unifier''' ('''mgu''') of the problem.
The terms on the left and the right hand side of each potential equation become syntactically equal when the mgu is applied i.e. {{math|1=''l''1''σ'' = ''r''1''σ'' ∧ ... ∧ ''l''''n''''σ'' = ''r''''n''''σ''}}.
Any unifier of the problem is subsumed<ref group=note>formally: each unifier τ satisfies {{math|1=∀''x'': ''xτ'' = (''xσ'')''ρ''}} for some substitution ρ</ref> by the mgu {{mvar|σ}}.
The mgu is unique up to variants: if ''S''1 and ''S''2 are both complete and minimal solution sets of the same syntactical unification problem, then ''S''1 = { ''σ''1 } and ''S''2 = { ''σ''2 } for some substitutions {{math|1=''σ''1}} and {{math|1=''σ''2,}} and {{math|1=''xσ''1}} is a variant of {{math|1=''xσ''2}} for each variable ''x'' occurring in the problem.

For example, the unification problem { ''x'' ≐ ''z'', ''y'' ≐ ''f''(''x'') } has a unifier { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }, because
:{|
|-
| align="right" | ''x''
| { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }
| =
| align="center" | ''z''
| =
| align="right" | ''z''
| { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }
|, and
|-
| align="right" | ''y''
| { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }
| =
| align="center" | ''f''(''z'')
| =
| align="right" | ''f''(''x'')
| { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }
| .
|}

This is also the most general unifier.
Other unifiers for the same problem are e.g. { ''x'' ↦ ''f''(''x''1), ''y'' ↦ ''f''(''f''(''x''1)), ''z'' ↦ ''f''(''x''1) }, { ''x'' ↦ ''f''(''f''(''x''1)), ''y'' ↦ ''f''(''f''(''f''(''x''1))), ''z'' ↦ ''f''(''f''(''x''1)) }, and so on; there are infinitely many similar unifiers.

As another example, the problem ''g''(''x'',''x'') ≐ ''f''(''y'') has no solution with respect to ≡ being literal identity, since any substitution applied to the left and right hand side will keep the outermost ''g'' and ''f'', respectively, and terms with different outermost function symbols are syntactically different.

===A unification algorithm===

{{Quote box|title=Robinson's 1965 unification algorithm
|quote={{hidden begin}}
Symbols are ordered such that variables precede function symbols.
Terms are ordered by increasing written length; equally long terms
are ordered [[lexicographic order|lexicographically]].{{refn|Robinson (1965);<ref name="Robinson.1965"/> nr.2.5, 2.14, p.25}} For a set ''T'' of terms, its disagreement
path ''p'' is the lexicographically least path where two member terms
of ''T'' differ. Its disagreement set is the set of [[term (logic)#Operations with terms|subterms starting at ''p'']],
formally: {{math|{ ''t''[[term (logic)#Operations with terms|{{pipe}}''p'']] : ''t''∈''T'' }}}.{{refn|Robinson (1965);<ref name="Robinson.1965"/> nr.5.6, p.32}}

'''Algorithm:'''{{refn|Robinson (1965);<ref name="Robinson.1965"/> nr.5.8, p.32}}

Given a set ''T'' of terms to be unified
Let σ initially be the [[substitution (logic)#First-order logic|identity substitution]]

'''do''' '''forever'''
'''if''' ''T''σ is a [[singleton set]] '''then'''
'''return''' σ
'''fi'''

let ''D'' be the disagreement set of ''T''σ
let ''s'', ''t'' be the two lexicographically least terms in ''D''

'''if''' ''s'' is not a variable '''or''' ''s'' occurs in ''t'' '''then'''
'''return''' "NONUNIFIABLE"
'''fi'''
σ := σ { ''s''↦''t'' }
'''done'''
{{hidden end}}
}}
The first algorithm given by Robinson (1965) was rather inefficient; cf. box.
The following faster algorithm originated from Martelli, Montanari (1982).<ref>Alg.1, p.261. Their rule '''(a)''' corresponds to rule '''swap''' here, '''(b)''' to '''delete''', '''(c)''' to both '''decompose''' and '''conflict''', and '''(d)''' to both '''eliminate''' and '''check'''.</ref>
This paper also lists preceding attempts to find an efficient syntactical unification algorithm,<ref>{{cite report | author=Lewis Denver Baxter | title=A practically linear unification algorithm | publisher=Univ. of Waterloo, Ontario | type=Res. Report | volume=CS-76-13 | url=https://cs.uwaterloo.ca/research/tr/1976/CS-76-13.pdf |date=Feb 1976 }}</ref><ref>{{cite thesis | author=[[Gérard Huet]] | title=Resolution d'Equations dans des Langages d'Ordre 1,2,...ω | publisher=Universite de Paris VII | type=These d'etat |date=Sep 1976 }}</ref><ref name="Martelli.Montanari.1976">{{cite report |author1=Alberto Martelli |author2=Ugo Montanari |lastauthoramp=yes | title=Unification in linear time and space: A structured presentation | publisher=Consiglio Nazionale delle Ricerche, Pisa | type=Internal Note | volume=IEI-B76-16 | url=http://puma.isti.cnr.it/publichtml/section_cnr_iei/cnr_iei_1976-B4-041.html |date=Jul 1976 }}</ref><ref name="Paterson.Wegman.1978">{{cite journal | author=[[Michael Stewart Paterson]] and M.N. Wegman | title=Linear unification | journal=J. Comput. Syst. Sci. | volume=16 | number=2 | pages=158–167 | url=http://www.sciencedirect.com/science/article/pii/0022000078900430/pdf?md5=404ce04b363525aef2a1277b2ec249d1&pid=1-s2.0-0022000078900430-main.pdf |date=Apr 1978 | doi = 10.1016/0022-0000(78)90043-0 }}</ref><ref>{{cite book | author=[[J.A. Robinson]] |chapter= Fast unification | editor= [[Woodrow W. Bledsoe]], Michael M. Richter| title=Proc. Theorem Proving Workshop Oberwolfach | publisher= | series=Oberwolfach Workshop Report | volume=1976/3 | url= http://oda.mfo.de/bsz325106819.html |date=Jan 1976 }}</ref><ref>{{cite journal | author=M. Venturini-Zilli | title=Complexity of the unification algorithm for first-order expressions |journal= Calcolo | volume=12 |number=4 |pages= 361–372 |date= Oct 1975 }}</ref> and states that linear-time algorithms were discovered independently by Martelli, Montanari (1976)<ref name="Martelli.Montanari.1976"/> and Paterson, Wegman (1978).<ref name="Paterson.Wegman.1978"/>{{refn|See Martelli, Montanari (1982),<ref name="Martelli.Montanari.1982"/> sect.1, p.259. Paterson's and Wegman's paper is dated 1978; however, the journal publisher received it in Sep.1976.}}

Given a finite set ''G'' = { ''s''1 ≐ ''t''1, ..., ''s''''n'' ≐ ''t''''n'' } of potential equations,
the algorithm applies rules to transform it to an equivalent set of equations of the form
{ ''x''1 ≐ ''u''1, ..., ''x''''m'' ≐ ''u''''m'' }
where ''x''1, ..., ''x''''m'' are distinct variables and ''u''1, ..., ''u''''m'' are terms containing none of the ''x''''i''.
A set of this form can be read as a substitution.
If there is no solution the algorithm terminates with ⊥; other authors use "Ω", "{}", or "''fail''" in that case.
The operation of substituting all occurrences of variable ''x'' in problem ''G'' with term ''t'' is denoted ''G'' {''x'' ↦ ''t''}.
For simplicity, constant symbols are regarded as function symbols having zero arguments.

:{|
| align="right" | ''G'' ∪ { ''t'' ≐ ''t'' }
| ⇒
| ''G''
|
|     '''delete'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''0,...,''s''''k'') ≐ ''f''(''t''0,...,''t''''k'') }
| ⇒
| ''G'' ∪ { ''s''0 ≐ ''t''0, ..., ''s''''k'' ≐ ''t''''k'' }
|
|     '''decompose'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''0,...,''s''''k'') ≐ ''g''(''t''0,...,''t''''m'') }
| ⇒
| ⊥
| align="right" | if ''f'' ≠ ''g'' or ''k'' ≠ ''m''
|     '''conflict'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''0,...,''s''''k'') ≐ x }
| ⇒
| ''G'' ∪ { ''x'' ≐ ''f''(''s''0,...,''s''''k'') }
|
|     '''swap'''
|-
| align="right" | ''G'' ∪ { ''x'' ≐ ''t'' }
| ⇒
| ''G''{''x''↦''t''} ∪ { ''x'' ≐ ''t'' }
| align="right" | if ''x'' ∉ ''vars''(''t'') and ''x'' ∈ ''vars''(''G'')
|     '''eliminate'''<ref group="note">Although the rule keeps ''x''≐''t'' in ''G'', it cannot loop forever since its precondition ''x''∈''vars''(''G'') is invalidated by its first application. More generally, the algorithm is guaranteed to terminate always, see [[#Proof of termination|below]].</ref>
|-
| align="right" | ''G'' ∪ { ''x'' ≐ ''f''(''s''0,...,''s''''k'') }
| ⇒
| ⊥
| align="right" | if ''x'' ∈ ''vars''(''f''(''s''0,...,''s''''k''))
|     '''check'''
|}

====Occurs check====
{{main|Occurs check}}
An attempt to unify a variable ''x'' with a term containing ''x'' as a strict subterm ''x''≐''f''(...,''x'',...) would lead to an infinite term as solution for ''x'', since ''x'' would occur as a subterm of itself.
In the set of (finite) first-order terms as defined above, the equation ''x''≐''f''(...,''x'',...) has no solution; hence the ''eliminate'' rule may only be applied if ''x'' ∉ ''vars''(''t'').
Since that additional check, called ''occurs check'', slows down the algorithm, it is omitted e.g. in most Prolog systems.
From a theoretical point of view, omitting the check amounts to solving equations over infinite trees, see [[#Unification of infinite terms|below]].

====Proof of termination====
For the proof of termination of the algorithm consider a triple {{math|<''n''''var'',''n''''lhs'',''n''''eqn''>}}
where {{math|''n''''var''}} is the number of variables that occur more than once in the equation set, {{math|''n''''lhs''}} is the number of function symbols and constants
on the left hand sides of potential equations, and {{math|''n''''eqn''}} is the number of equations.
When rule ''eliminate'' is applied, {{math|''n''''var''}} decreases, since ''x'' is eliminated from ''G'' and kept only in { ''x'' ≐ ''t'' }.
Applying any other rule can never increase {{math|''n''''var''}} again.
When rule ''decompose'', ''conflict'', or ''swap'' is applied, {{math|''n''''lhs''}} decreases, since at least the left hand side's outermost ''f'' disappears.
Applying any of the remaining rules ''delete'' or ''check'' can't increase {{math|''n''''lhs''}}, but decreases {{math|''n''''eqn''}}.
Hence, any rule application decreases the triple {{math|<''n''''var'',''n''''lhs'',''n''''eqn''>}} with respect to the [[lexicographical order]], which is possible only a finite number of times.

[[Conor McBride]] observes<ref>{{cite journal|last=McBride|first=Conor|title=First-Order Unification by Structural Recursion|journal=Journal of Functional Programming|date=October 2003|volume=13|issue=6|pages=1061–1076|doi=10.1017/S0956796803004957|url=http://strictlypositive.org/unify.ps.gz|accessdate=30 March 2012|issn=0956-7968}}</ref> that “by expressing the structure which unification exploits” in a [[Dependent type|dependently typed]] language such as [[Epigram (programming language)|Epigram]], [[John Alan Robinson|Robinson]]'s algorithm can be made [[Structural induction|recursive on the number of variables]], in which case a separate termination proof becomes unnecessary.

===Examples of syntactic unification of first-order terms===
In the Prolog syntactical convention a symbol starting with an upper case letter is a variable name; a symbol that starts with a lowercase letter is a function symbol; the comma is used as the logical ''and'' operator.
For maths notation, ''x,y,z'' are used as variables, ''f,g'' as function symbols, and ''a,b'' as constants.
{| class="wikitable"
|-
! Prolog Notation !! Maths Notation !! Unifying Substitution !! Explanation
|-
| <code> a = a </code> || { ''a'' = ''a'' } || {} || Succeeds. ([[Tautology (logic)|tautology]])
|-
| <code> a = b </code> || { ''a'' = ''b'' } || ⊥ || ''a'' and ''b'' do not match
|-
| <code> X = X </code> || { ''x'' = ''x'' } || {} || Succeeds. ([[Tautology (logic)|tautology]])
|-
| <code> a = X </code> || { ''a'' = ''x'' } || { ''x'' ↦ ''a'' } || ''x'' is unified with the constant ''a''
|-
| <code> X = Y </code> || { ''x'' = ''y'' } || { ''x'' ↦ ''y'' } || ''x'' and ''y'' are aliased
|-
| <code> f(a,X) = f(a,b) </code> || { ''f''(''a'',''x'') = ''f''(''a'',''b'') } || { ''x'' ↦ ''b'' } || function and constant symbols match, ''x'' is unified with the constant ''b''
|-
| <code> f(a) = g(a) </code> || { ''f''(''a'') = ''g''(''a'') } || ⊥ || ''f'' and ''g'' do not match
|-
| <code> f(X) = f(Y) </code> || { ''f''(''x'') = ''f''(''y'') } || { ''x'' ↦ ''y'' } || ''x'' and ''y'' are aliased
|-
| <code> f(X) = g(Y) </code> || { ''f''(''x'') = ''g''(''y'') } || ⊥ || ''f'' and ''g'' do not match
|-
| <code> f(X) = f(Y,Z) </code> || { ''f''(''x'') = ''f''(''y'',''z'') } || ⊥ || Fails. The ''f'' function symbols have different arity
|-
| <code> f(g(X)) = f(Y) </code> || { ''f''(''g''(''x'')) = ''f''(''y'') } || { ''y'' ↦ ''g''(''x'') } || Unifies ''y'' with the term {{tmath|g(x)}}
|-
| <code> f(g(X),X) = f(Y,a) </code> || { ''f''(''g''(''x''),''x'') = ''f''(''y'',''a'') } || { ''x'' ↦ ''a'', ''y'' ↦ ''g''(''a'') } || Unifies ''x'' with constant ''a'', and ''y'' with the term {{tmath|g(a)}}
|-
| <code> X = f(X) </code> || { ''x'' = ''f''(''x'') } || should be ⊥ || Returns ⊥ in first-order logic and many modern Prolog dialects (enforced by the ''[[occurs check]]'').
Succeeds in traditional Prolog and in Prolog II, unifying ''x'' with infinite term <code>x=f(f(f(f(...))))</code>.
|-
| <code> X = Y, Y = a </code> || { ''x'' = ''y'', ''y'' = ''a'' } || { ''x'' ↦ ''a'', ''y'' ↦ ''a'' } || Both ''x'' and ''y'' are unified with the constant ''a''
|-
| <code> a = Y, X = Y </code> || { ''a'' = ''y'', ''x'' = ''y'' } || { ''x'' ↦ ''a'', ''y'' ↦ ''a'' } || As above (order of equations in set doesn't matter)
|-
| <code> X = a, b = X </code> || { ''x'' = ''a'', ''b'' = ''x'' } || ⊥ || Fails. ''a'' and ''b'' do not match, so ''x'' can't be unified with both
|}

[[File:Unification exponential blow-up svg.svg|thumb|Two terms with an exponentially larger tree for their least common instance. Its [[directed acyclic graph|dag]] representation (rightmost, orange part) is still of linear size.]]
The most general unifier of a syntactic first-order unification problem of [[Term (logic)#Operations with terms|size]] {{mvar|n}} may have a size of {{math|2''n''}}. For example, the problem {{tmath| (((a*z)*y)*x)*w \doteq w*(x*(y*(z*a))) }} has the most general unifier {{tmath| z \mapsto a, y \mapsto a*a, x \mapsto (a*a)*(a*a), w \mapsto ((a*a)*(a*a))*((a*a)*(a*a)) }}, cf. picture. In order to avoid exponential time complexity caused by such blow-up, advanced unification algorithms work on [[directed acyclic graph]]s (dags) rather than trees.{{refn|e.g. Paterson, Wegman (1978),<ref name="Paterson.Wegman.1978"/> sect.2, p.159}}

===Application: Unification in logic programming===

The concept of unification is one of the main ideas behind [[logic programming]], best known through the language [[Prolog]]. It represents the mechanism of binding the contents of variables and can be viewed as a kind of one-time assignment. In Prolog, this operation is denoted by the equality symbol <code>=</code>, but is also done when instantiating variables (see below). It is also used in other languages by the use of the equality symbol <code>=</code>, but also in conjunction with many operations including <code>+</code>, <code>-</code>, <code>*</code>, <code>/</code>. [[Type inference]] algorithms are typically based on unification.

In Prolog:
# A [[variable (programming)|variable]] which is uninstantiated—i.e. no previous unifications were performed on it—can be unified with an atom, a term, or another uninstantiated variable, thus effectively becoming its alias. In many modern Prolog dialects and in [[first-order logic]], a variable cannot be unified with a term that contains it; this is the so-called ''[[occurs check]]''.
# Two atoms can only be unified if they are identical.
# Similarly, a term can be unified with another term if the top function symbols and [[Arity|arities]] of the terms are identical and if the parameters can be unified simultaneously. Note that this is a recursive behavior.

=== Application: Type inference ===

Unification is used during type inference, for instance in the functional programming language [[Haskell (programming language)|Haskell]]. On one hand, the programmer does not need to provide type information for every function, on the other hand it is used to detect typing errors. The Haskell expression <code>True : ['a', 'b', 'c']</code> is not correctly typed. The list construction function <code>(:)</code> is of type <code>a -> [a] -> [a]</code>, and for the first argument <code>True</code> the polymorphic type variable <code>a</code> has to be unified with <code>True</code>'s type, <code>Bool</code>. The second argument, <code>['a', 'b', 'c']</code>, is of type <code>[Char]</code>, but <code>a</code> cannot be both <code>Bool</code> and <code>Char</code> at the same time.

Like for Prolog, an algorithm for type inference can be given:

# Any type variable unifies with any type expression, and is instantiated to that expression. A specific theory might restrict this rule with an occurs check.
# Two type constants unify only if they are the same type.
# Two type constructions unify only if they are applications of the same type constructor and all of their component types recursively unify.

Due to its declarative nature, the order in a sequence of unifications is (usually) unimportant.

Note that in the terminology of [[first-order logic]], an atom is a basic proposition and is unified similarly to a Prolog term.

==Order-sorted unification==
''[[Many-sorted logic#Order-sorted logic|Order-sorted logic]]'' allows one to assign a ''sort'', or ''type'', to each term, and to declare a sort ''s''1 a ''subsort'' of another sort ''s''2, commonly written as ''s''1 ⊆ ''s''2. For example, when reаsoning about biological creatures, it is useful to declare a sort ''dog'' to be a subsort of a sort ''animal''. Wherever a term of some sort ''s'' is required, a term of any subsort of ''s'' may be supplied instead.
For example, assuming a function declaration ''mother'': ''animal'' → ''animal'', and a constant declaration ''lassie'': ''dog'', the term ''mother''(''lassie'') is perfectly valid and has the sort ''animal''. In order to supply the information that the mother of a dog is a dog in turn, another declaration ''mother'': ''dog'' → ''dog'' may be issued; this is called ''function overloading'', similar to [[Overloading (programming)|overloading in programming languages]].

[[Christoph Walther|Walther]] gave a unification algorithm for terms in order-sorted logic, requiring for any two declared sorts ''s''1, ''s''2 their intersection ''s''1 ∩ ''s''2 to be declared, too: if ''x''1 and ''x''2 is a variable of sort ''s''1 and ''s''2, respectively, the equation ''x''1 ≐ ''x''2 has the solution { ''x''1 = ''x'', ''x''2 = ''x'' }, where ''x'': ''s''1 ∩ ''s''2.
<ref>{{cite journal|first1=Christoph|last1=Walther|authorlink=Christoph Walther|title=A Mechanical Solution of Schubert's Steamroller by Many-Sorted Resolution|journal=Artif. Intell.|volume=26|number=2|pages=217–224|url=http://www.inferenzsysteme.informatik.tu-darmstadt.de/media/is/publikationen/Schuberts_Steamroller_by_Many-Sorted_Resolution-AIJ-25-2-1985.pdf|year=1985|doi=10.1016/0004-3702(85)90029-3}}</ref>
After incorporating this algorithm into a clause-based automated theorem prover, he could solve a benchmark problem by translating it into order-sorted logic, thereby boiling it down an order of magnitude, as many unary predicates turned into sorts.

Smolka generalized order-sorted logic to allow for [[parametric polymorphism]].
<ref>{{cite conference|first1=Gert|last1=Smolka|title=Logic Programming with Polymorphically Order-Sorted Types|conference=Int. Workshop Algebraic and Logic Programming|publisher=Springer|series=LNCS|volume=343|pages=53–70|date=Nov 1988}}</ref>
In his framework, subsort declarations are propagated to complex type expressions.
As a programming example, a parametric sort ''list''(''X'') may be declared (with ''X'' being a type parameter as in a [[Template (C++)#Function templates|C++ template]]), and from a subsort declaration ''int'' ⊆ ''float'' the relation ''list''(''int'') ⊆ ''list''(''float'') is automatically inferred, meaning that each list of integers is also a list of floats.

Schmidt-Schauß generalized order-sorted logic to allow for term declarations.
<ref>{{cite book|first1=Manfred|last1=Schmidt-Schauß|title=Computational Aspects of an Order-Sorted Logic with Term Declarations|publisher=Springer|series=LNAI|volume=395|date=Apr 1988}}</ref>
As an example, assuming subsort declarations ''even'' ⊆ ''int'' and ''odd'' ⊆ ''int'', a term declaration like ∀''i'':''int''. (''i''+''i''):''even'' allows to declare a property of integer addition that could not be expressed by ordinary overloading.

==Unification of infinite terms==

Background on infinite trees:
* {{cite journal| author=B. Courcelle|authorlink=Bruno Courcelle| title=Fundamental Properties of Infinite Trees| journal=Theoret. Comput. Sci.| year=1983| volume=25| number=| pages=95–169| url=http://www.diku.dk/hjemmesider/ansatte/henglein/papers/courcelle1983.pdf| doi=10.1016/0304-3975(83)90059-2}}
* {{cite book| author=Michael J. Maher| chapter=Complete Axiomatizations of the Algebras of Finite, Rational and Infinite Trees| title=Proc. IEEE 3rd Annual Symp. on Logic in Computer Science, Edinburgh|date=Jul 1988| pages=348–357}}
* {{cite journal|author1=Joxan Jaffar |author2=Peter J. Stuckey | title=Semantics of Infinite Tree Logic Programming| journal=Theoretical Computer Science| year=1986| volume=46| pages=141–158| doi=10.1016/0304-3975(86)90027-7}}

Unification algorithm, Prolog II:
* {{cite book| author=A. Colmerauer| authorlink=Alain Colmerauer|title=Prolog and Infinite Trees| year=1982| pages=| publisher=Academic Press|editor1=K.L. Clark |editor2=S.-A. Tarnlund }}
* {{cite book| author=Alain Colmerauer| chapter=Equations and Inequations on Finite and Infinite Trees| title=Proc. Int. Conf. on Fifth Generation Computer Systems| year=1984| pages=85–99| editor=ICOT}}

Applications:
* {{cite journal|author1=Francis Giannesini |author2=Jacques Cohen | title=Parser Generation and Grammar Manipulation using Prolog's Infinite Trees| journal=J. Logic Programming| year=1984| volume=3| pages=253–265}}

==E-unification==

'''E-unification''' is the problem of finding solutions to a given set of [[equations]],
taking into account some equational background knowledge ''E''.
The latter is given as a set of universal [[Equality (mathematics)|equalities]].
For some particular sets ''E'', equation solving [[algorithms]] (a.k.a. ''E-unification algorithms'') have been devised;
for others it has been proven that no such algorithms can exist.

For example, if {{mvar|a}} and {{mvar|b}} are distinct constants,
the [[equation]] {{tmath|x * a \doteq y * b}} has no solution
with respect to purely [[Unification (computer science)#Syntactic unification problem on first-order terms|syntactic unification]],
where nothing is known about the operator {{tmath|*}}.
However, if the {{tmath|*}} is known to be [[Commutativity|commutative]],
then the substitution {{math|{{mset|''x'' ↦ ''b'', ''y'' ↦ ''a''}}}} solves the above equation,
since
:{|
|
| {{tmath|x * a}}
| {{math|{{mset|''x'' ↦ ''b'', ''y'' ↦ ''a''}}}}
|-
| {{=}}
| {{tmath|b * a}}
|
| by [[Unification (computer science)#Substitution|substitution application]]
|-
| {{=}}
| {{tmath|a * b}}
|
| by commutativity of {{tmath|*}}
|-
| {{=}}
| {{tmath|y * b}}
| {{math|{{mset|''x'' ↦ ''b'', ''y'' ↦ ''a''}}}}
| by (converse) substitution application
|}
The background knowledge ''E'' could state the commutativity of {{tmath|*}} by the universal equality
"{{tmath|1=u * v = v * u}} for all {{math|''u'', ''v''}}".

===Particular background knowledge sets E===

{|
|+ '''Used naming conventions'''
| {{math|∀ ''u'',''v'',''w'':}}
| align="right" | {{tmath|u*(v*w)}}
| {{=}}
| {{tmath|(u*v)*w}}
| align="center" | '''{{mvar|A}}'''
| Associativity of {{tmath|*}}
|-
| {{math|∀ ''u'',''v'':}}
| align="right" | {{tmath|u*v}}
| =
| {{tmath|v*u}}
| align="center" | '''{{mvar|C}}'''
| Commutativity of {{tmath|*}}
|-
| {{math|∀ ''u'',''v'',''w'':}}
| align="right" | {{tmath|u*(v+w)}}
| {{=}}
| {{tmath|u*v+u*w}}
| align="center" | '''{{mvar|Dl}}'''
| Left distributivity of {{tmath|*}} over {{tmath|+}}
|-
| {{math|∀ ''u'',''v'',''w'':}}
| align="right" | {{tmath|(v+w)*u}}
| {{=}}
| {{tmath|v*u+w*u}}
| align="center" | '''{{mvar|Dr}}'''
| Right distributivity of {{tmath|*}} over {{tmath|+}}
|-
| {{math|∀ ''u'':}}
| align="right" | {{tmath|u*u}}
| {{=}}
| {{mvar|u}}
| align="center" | '''{{mvar|I}}'''
| Idempotence of {{tmath|*}}
|-
| {{math|∀ ''u'':}}
| align="right" | {{tmath|n*u}}
| {{=}}
| {{mvar|u}}
| align="center" | '''{{mvar|Nl}}'''
| Left neutral element {{mvar|n}} with respect to {{tmath|*}}
|-
| {{math|∀ ''u'':}}
| align="right" | {{tmath|u*n}}
| {{=}}
| {{mvar|u}}
| align="center" |     '''{{mvar|Nr}}'''    
| Right neutral element {{mvar|n}} with respect to {{tmath|*}}
|}

It is said that ''unification is decidable'' for a theory, if a unification algorithm has been devised for it that terminates for ''any'' input problem.
It is said that ''unification is [[Decidable problem#Decidability|semi-decidable]]'' for a theory, if a unification algorithm has been devised for it that terminates for any ''solvable'' input problem, but may keep searching forever for solutions of an unsolvable input problem.

'''Unification is decidable''' for the following theories:
* '''{{mvar|A}}'''<ref>[[Gordon D. Plotkin]], ''Lattice Theoretic Properties of Subsumption'', Memorandum MIP-R-77, Univ. Edinburgh, Jun 1970</ref>
* '''{{mvar|A}}''','''{{mvar|C}}'''<ref>[[Mark E. Stickel]], ''A Unification Algorithm for Associative-Commutative Functions'', J. Assoc. Comput. Mach., vol.28, no.3, pp. 423–434, 1981</ref>
* '''{{mvar|A}}''','''{{mvar|C}}''','''{{mvar|I}}'''<ref name="Fages.1987">F. Fages, ''Associative-Commutative Unification'', J. Symbolic Comput., vol.3, no.3, pp. 257–275, 1987</ref>
* '''{{mvar|A}}''','''{{mvar|C}}''','''{{mvar|Nl}}'''<ref group=note name="LRequivC">in the presence of equality '''{{mvar|C}}''', equalities '''{{mvar|Nl}}''' and '''{{mvar|Nr}}''' are equivalent, similar for '''{{mvar|Dl}}''' and '''{{mvar|Dr}}'''</ref><ref name="Fages.1987"/>
* '''{{mvar|A}}''','''{{mvar|I}}'''<ref>Franz Baader, ''Unification in Idempotent Semigroups is of Type Zero'', J. Automat. Reasoning, vol.2, no.3, 1986</ref>
* '''{{mvar|A}}''','''{{mvar|Nl}}'''{{mvar|,}}'''{{mvar|Nr}}''' (monoid)<ref>J. Makanin, ''The Problem of Solvability of Equations in a Free Semi-Group'', Akad. Nauk SSSR, vol.233, no.2, 1977</ref>
* '''{{mvar|C}}'''<ref>{{cite journal| author=F. Fages| title=Associative-Commutative Unification| journal=J. Symbolic Comput.| year=1987| volume=3| number=3| pages=257–275| doi=10.1016/s0747-7171(87)80004-4}}</ref>
* [[Boolean ring]]s<ref>{{cite book| author=Martin, U., Nipkow, T.| chapter=Unification in Boolean Rings| title=Proc. 8th CADE| year=1986| volume=230| pages=506–513| publisher=Springer| editor=Jörg H. Siekmann| series=LNCS}}</ref><ref>{{cite journal|author1=A. Boudet |author2=J.P. Jouannaud |author3=M. Schmidt-Schauß | title=Unification of Boolean Rings and Abelian Groups| journal=Journal of Symbolic Computation| year=1989| volume=8| pages=449–477 |url=http://www.sciencedirect.com/science/article/pii/S0747717189800549/pdf?md5=713ed362e4b6f2db53923cc5ed47c818&pid=1-s2.0-S0747717189800549-main.pdf| doi=10.1016/s0747-7171(89)80054-9}}</ref>
* [[Abelian group]]s, even if the signature is expanded by arbitrary additional symbols (but not axioms)<ref name="Baader and Snyder 2001, p. 486">Baader and Snyder (2001), p. 486.</ref>
* [[Kripke semantics#Correspondence and completeness|K4]] [[modal algebra]]s<ref>F. Baader and S. Ghilardi, ''Unification in modal and description logics'', Logic Journal of the IGPL 19 (2011), no. 6, pp. 705–730.</ref>

'''Unification is semi-decidable''' for the following theories:
* '''{{mvar|A}}''','''{{mvar|Dl}}'''{{mvar|,}}'''{{mvar|Dr}}'''<ref>P. Szabo, ''Unifikationstheorie erster Ordnung'' (''First Order Unification Theory''), Thesis, Univ. Karlsruhe, West Germany, 1982</ref>
* '''{{mvar|A}}''','''{{mvar|C}}''','''{{mvar|Dl}}'''<ref group=note name="LRequivC"/><ref>Jörg H. Siekmann, ''Universal Unification'', Proc. 7th Int. Conf. on Automated Deduction, Springer LNCS vol.170, pp. 1–42, 1984</ref>
* [[Commutative ring]]s<ref name="Baader and Snyder 2001, p. 486"/>

===One-sided paramodulation===

If there is a [[Term rewriting#Termination and convergence|convergent term rewriting system]] ''R'' available for ''E'',
the '''one-sided paramodulation''' algorithm<ref>N. Dershowitz and G. Sivakumar, ''Solving Goals in Equational Languages'', Proc. 1st Int. Workshop on Conditional Term Rewriting Systems, Springer LNCS vol.308, pp. 45–55, 1988</ref>
can be used to enumerate all solutions of given equations.

{| style="border: 1px solid darkgray;"
|+ One-sided paramodulation rules
|- border="0"
| align="right" | ''G'' ∪ { ''f''(''s''1,...,''s''''n'') ≐ ''f''(''t''1,...,''t''''n'') }
| ; ''S''
| ⇒
| align="right" | ''G'' ∪ { ''s''1 ≐ ''t''1, ..., ''s''''n'' ≐ ''t''''n'' }
| ''; ''S''
|
|     '''decompose'''
|-
| align="right" | ''G'' ∪ { ''x'' ≐ ''t'' }
| ; ''S''
| ⇒
| align="right" | ''G'' { ''x'' ↦ ''t'' }
|; ''S''{''x''↦''t''} ∪ {''x''↦''t''}
| align="right" | if the variable ''x'' doesn't occur in ''t''
|     '''eliminate'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''1,...,''s''''n'') ≐ ''t'' }
| ; ''S''
| ⇒
| align="right" | ''G'' ∪ { ''s''1 ≐ u1, ..., ''s''''n'' ≐ u''n'', ''r'' ≐ ''t'' }
| ; ''S''
| align="right" |     if ''f''(''u''1,...,''u''''n'') → ''r'' is a rule from ''R''
|     '''mutate'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''1,...,''s''''n'') ≐ ''y'' }
| ; ''S''
|⇒
| align="right" | ''G'' ∪ { ''s''1 ≐ ''y''1, ..., ''s''''n'' ≐ ''y''''n'', ''y'' ≐ ''f''(''y''1,...,''y''''n'') }
| ; ''S''
| align="right" | if ''y''1,...,''y''''n'' are new variables
|     '''imitate'''
|}

Starting with ''G'' being the unification problem to be solved and ''S'' being the identity substitution, rules are applied nondeterministically until the empty set appears as the actual ''G'', in which case the actual ''S'' is a unifying substitution. Depending on the order the paramodulation rules are applied, on the choice of the actual equation from ''G'', and on the choice of ''R''’s rules in ''mutate'', different computations paths are possible. Only some lead to a solution, while others end at a ''G'' ≠ {} where no further rule is applicable (e.g. ''G'' = { ''f''(...) ≐ ''g''(...) }).

{| style="border: 1px solid darkgray;"
|+ Example term rewrite system ''R''
|- border="0"
| '''1'''
| ''app''(''nil'',''z'')
| → ''z''
|-
|'''2'''    
| ''app''(''x''.''y'',''z'')
| → ''x''.''app''(''y'',''z'')
|}

For an example, a term rewrite system ''R'' is used defining the ''append'' operator of lists built from ''cons'' and ''nil''; where ''cons''(''x'',''y'') is written in infix notation as ''x''.''y'' for brevity; e.g. ''app''(''a''.''b''.''nil'',''c''.''d''.''nil'') → ''a''.''app''(''b''.''nil'',''c''.''d''.''nil'') → ''a''.''b''.''app''(''nil'',''c''.''d''.''nil'') → ''a''.''b''.''c''.''d''.''nil'' demonstrates the concatenation of the lists ''a''.''b''.''nil'' and ''c''.''d''.''nil'', employing the rewrite rule 2,2, and 1. The equational theory ''E'' corresponding to ''R'' is the [[Closure (mathematics)#P closures of binary relations|congruence closure]] of ''R'', both viewed as binary relations on terms.
For example, ''app''(''a''.''b''.''nil'',''c''.''d''.''nil'') ≡ ''a''.''b''.''c''.''d''.''nil'' ≡ ''app''(''a''.''b''.''c''.''d''.''nil'',''nil''). The paramodulation algorithm enumerates solutions to equations with respect to that ''E'' when fed with the example ''R''.

A successful example computation path for the unification problem { ''app''(''x'',''app''(''y'',''x'')) ≐ ''a''.''a''.''nil'' } is shown below. To avoid variable name clashes, rewrite rules are consistently renamed each time before their use by rule ''mutate''; ''v''2, ''v''3, ... are computer-generated variable names for this purpose. In each line, the chosen equation from ''G'' is highlighted in red. Each time the ''mutate'' rule is applied, the chosen rewrite rule (''1'' or ''2'') is indicated in parentheses. From the last line, the unifying substitution ''S'' = { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' } can be obtained. In fact,
''app''(''x'',''app''(''y'',''x'')) {''y''↦''nil'', ''x''↦ ''a''.''nil'' } = ''app''(''a''.''nil'',''app''(''nil'',''a''.''nil'')) ≡ ''app''(''a''.''nil'',''a''.''nil'') ≡ ''a''.''app''(''nil'',''a''.''nil'') ≡ ''a''.''a''.''nil'' solves the given problem.
A second successful computation path, obtainable by choosing "mutate(1), mutate(2), mutate(2), mutate(1)" leads to the substitution ''S'' = { ''y'' ↦ ''a''.''a''.''nil'', ''x'' ↦ ''nil'' }; it is not shown here. No other path leads to a success.

{| class="wikitable"
|+ Example unifier computation
|-
! Used rule !! !! ''G'' !! ''S''
|-
| ||
| { {{color|red|''app''(''x'',''app''(''y'',''x'')) ≐ ''a''.''a''.''nil''}} }
| {}
|-
| mutate(2) || ⇒
| { ''x'' ≐ ''v''2.''v''3, ''app''(''y'',''x'') ≐ ''v''4, {{color|red|''v''2.''app''(''v''3,''v''4) ≐ ''a''.''a''.''nil''}} }
| {}
|-
| decompose || ⇒
| { {{color|red|''x'' ≐ ''v''2.''v''3}}, ''app''(''y'',''x'') ≐ ''v''4, ''v''2 ≐ ''a'', ''app(''v''3,''v''4) ≐ ''a''.''ni''l }
| {}
|-
| eliminate || ⇒
| { ''app''(''y'',''v''2.''v''3) ≐ ''v''4, {{color|red|''v''2 ≐ ''a''}}, ''app''(''v''3,''v''4) ≐ ''a''.''nil'' }
| { ''x'' ↦ ''v''2.''v''3 }
|-
| eliminate || ⇒
| { {{color|red|''app''(''y'',''a''.''v''3) ≐ ''v''4}}, ''app''(''v''3,''v''4) ≐ ''a''.''nil'' }
| { ''x'' ↦ ''a''.''v''3 }
|-
| mutate(1) || ⇒
| { ''y'' ≐ ''nil'', ''a''.''v''3 ≐ ''v''5, {{color|red|''v''5 ≐ ''v''4}}, ''app''(''v''3,''v''4) ≐ ''a''.''nil'' }
| { ''x'' ↦ ''a''.''v''3 }
|-
| eliminate || ⇒
| { {{color|red|''y'' ≐ ''nil''}}, ''a''.''v''3 ≐ ''v''4, ''app''(''v''3,''v''4) ≐ ''a''.''nil'' }
| { ''x'' ↦ ''a''.''v''3 }
|-
| eliminate || ⇒
| { ''a''.''v''3 ≐ ''v''4, {{color|red|''app''(''v''3,''v''4) ≐ ''a''.''nil''}} }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''v''3 }
|-
| mutate(1) || ⇒
| { ''a''.''v''3 ≐ ''v''4, ''v''3 ≐ ''nil'', {{color|red|''v''4 ≐ ''v''6}}, ''v''6 ≐ ''a''.''nil'' }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''v''3 }
|-
| eliminate || ⇒
| { ''a''.''v''3 ≐ ''v''4, {{color|red|''v''3 ≐ ''nil''}}, ''v''4 ≐ ''a''.''nil'' }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''v''3 }
|-
| eliminate || ⇒
| { ''a''.''nil'' ≐ ''v''4, {{color|red|''v''4 ≐ ''a''.''nil''}} }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|-
| eliminate || ⇒
| { {{color|red|''a''.''nil'' ≐ ''a''.''nil''}} }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|-
| decompose || ⇒
| { {{color|red|''a'' ≐ ''a''}}, ''nil'' ≐ ''nil'' }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|-
| decompose || ⇒
| { {{color|red|''nil'' ≐ ''nil''}} }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|-
| decompose     || ⇒    
| {}
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|}

===Narrowing===

[[File:Triangle diagram of narrowing step svg.svg|thumb|Triangle diagram of narrowing step ''s'' ~› ''t'' at position ''p'' in term ''s'', with unifying substitution σ (bottom row), using a rewrite rule {{math|1=''l'' → ''r''}} (top row)]]
If ''R'' is a [[Term rewriting#Termination and convergence|convergent term rewriting system]] for ''E'',
an approach alternative to the previous section consists in successive application of "'''narrowing steps'''";
this will eventually enumerate all solutions of a given equation.
A narrowing step (cf. picture) consists in
* choosing a nonvariable subterm of the current term,
* [[#Syntactic unification of first-order terms|syntactically unifying]] it with the left hand side of a rule from ''R'', and
* replacing the instantiated rule's right hand side into the instantiated term.
Formally, if {{math|''l'' → ''r''}} is a [[Term (logic)#Structural equality|renamed copy]] of a rewrite rule from ''R'', having no variables in common with a term ''s'', and the [[Term (logic)#Operations with terms|subterm]] {{math|''s''{{!}}''p''}} is not a variable and is unifiable with {{mvar|l}} via the [[#Syntactic unification of first-order terms|mgu]] {{mvar|σ}}, then {{mvar|s}} can be '''narrowed''' to the term {{math|1=''t'' = ''sσ''[''rσ'']''p''}}, i.e. to the term {{mvar|sσ}}, with the subterm at ''p'' [[Term (logic)#Operations with terms|replaced]] by {{mvar|rσ}}. The situation that ''s'' can be narrowed to ''t'' is commonly denoted as ''s'' ~› ''t''.
Intuitively, a sequence of narrowing steps ''t''1 ~› ''t''2 ~› ... ~› ''t''''n'' can be thought of as a sequence of rewrite steps ''t''1 → ''t''2 → ... → ''t''''n'', but with the initial term ''t''1 being further and further instantiated, as necessary to make each of the used rules applicable.

The [[#One-sided paramodulation|above]] example paramodulation computation corresponds to the following narrowing sequence ("↓" indicating instantiation here):

{|
|-
| ''app''( || ''x'' || ,''app''(''y'', || ''x'' || ))
|-
| || ↓ || || ↓ || || || || || || || || || || || || || || ''x'' ↦ ''v''2.''v''3
|-
| ''app''( || ''v''2.''v''3 || ,''app''(''y'', || ''v''2.''v''3 || )) || → || ''v''2.''app''(''v''3,''app''( || ''y'' || ,''v''2.''v''3))
|-
| || || || || || || || ↓ || || || || || || || || || || ''y'' ↦ ''nil''
|-
| || || || || || || ''v''2.''app''(''v''3,''app''( || ''nil'' || ,''v''2.''v''3)) || → || ''v''2.''app''( || ''v''3 || ,''v''2. || ''v''3 || )
|-
| || || || || || || || || || || || ↓ || || ↓ || || || || ''v''3 ↦ ''nil''
|-
| || || || || || || || || || || ''v''2.''app''( || ''nil'' || ,''v''2. || ''nil'' || ) || → || ''v''2.''v''2.''nil''
|}

The last term, ''v''2.''v''2.''nil'' can be syntactically unified with the original right hand side term ''a''.''a''.''nil''.

The ''narrowing lemma''<ref>{{cite book| author=Fay| chapter=First-Order Unification in an Equational Theory| title=Proc. 4th Workshop on Automated Deduction| year=1979| pages=161–167}}</ref> ensures that whenever an instance of a term ''s'' can be rewritten to a term ''t'' by a convergent term rewriting system, then ''s'' and ''t'' can be narrowed and rewritten to a term {{math|1=''s''’}} and {{math|1=''t''’}}, respectively, such that {{math|1=''t''’}} is an instance of {{math|1=''s''’}}.

Formally: whenever {{math|1=''sσ'' {{underset|&lowast;|→}} ''t''}} holds for some substitution σ, then there exist terms {{math|''s''’, ''t''’}} such that {{math|''s'' {{underset|&lowast;|~›}} ''s''’}} and {{math|''t'' {{underset|&lowast;|→}} ''t''’}} and {{math|1=''s''’''τ'' = ''t''’}} for some substitution τ.

==Higher-order unification==

Many applications require one to consider the unification of typed lambda-terms instead of first-order terms. Such unification is often called ''higher-order unification''. A well studied branch of higher-order unification is the problem of unifying simply typed lambda terms modulo the equality determined by αβη conversions. Such unification problems do not have most general unifiers. While higher-order unification is [[Undecidable problem|undecidable]],<ref>{{cite journal| author=Warren D. Goldfarb| authorlink=Warren D. Goldfarb| title=The Undecidability of the Second-Order Unification Problem| journal=TCS| year=1981| volume=13| pages=225–230| url=http://www.sciencedirect.com/science/article/pii/0304397581900402/pdf?md5=ebe7687d034498bb76c4ea9c5df56f84&pid=1-s2.0-0304397581900402-main.pdf| doi=10.1016/0304-3975(81)90040-2}}</ref><ref>{{cite journal| author=Gérard P. Huet| title=The Undecidability of Unification in Third Order Logic| journal=Information and Control| year=1973| volume=22| pages=257–267 |url=http://www.sciencedirect.com/science/article/pii/S001999587390301X/pdf?md5=0833289609c3d777bdec01d5d6ced2aa&pid=1-s2.0-S001999587390301X-main.pdf |doi=10.1016/S0019-9958(73)90301-X}}</ref><ref>Claudio Lucchesi: The Undecidability of the Unification Problem for Third Order Languages (Research Report CSRR 2059; Department of Computer Science, University of Waterloo, 1972)</ref> [[Gérard Huet]] gave a [[semi-decidable]] (pre-)unification algorithm<ref>Gérard Huet: A Unification Algorithm for typed Lambda-Calculus []</ref> that allows a systematic search of the space of unifiers (generalizing the unification algorithm of Martelli-Montanari<ref name="Martelli.Montanari.1982"/> with rules for terms containing higher-order variables) that seems to work sufficiently well in practice. Huet<ref>[http://portal.acm.org/citation.cfm?id=695200 Gérard Huet: Higher Order Unification 30 Years Later]</ref> and Gilles Dowek<ref>Gilles Dowek: Higher-Order Unification and Matching. Handbook of Automated Reasoning 2001: 1009–1062</ref> have written articles surveying this topic.

[[Dale Miller (computer scientist)|Dale Miller]] has described what is now called [[higher-order pattern unification]].<ref>{{cite journal|first1=Dale|last1=Miller|title=A Logic Programming Language with Lambda-Abstraction, Function Variables, and Simple Unification|journal=Journal of Logic and Computation|year=1991|pages=497–536|url=http://www.lix.polytechnique.fr/Labo/Dale.Miller/papers/jlc91.pdf}}</ref> This subset of higher-order unification is decidable and solvable unification problems have most-general unifiers. Many computer systems that contain higher-order unification, such as the higher-order logic programming languages [[λProlog]] and [[Twelf]], often implement only the pattern fragment and not full higher-order unification.

In computational linguistics, one of the most influential theories of [[Elliptical construction|ellipsis]] is that ellipses are represented by free variables whose values are then determined using Higher-Order Unification (HOU). For instance, the semantic representation of "Jon likes Mary and Peter does too" is {{math| like(''j'', ''m'') &and; R(''p'') }} and the value of R (the semantic representation of the ellipsis) is determined by the equation {{math| like(''j'', ''m'') {{=}} R(''j'') }}. The process of solving such equations is called Higher-Order Unification.<ref>{{cite book| first1 = Claire | last1 = Gardent | first2 = Michael | last2 = Kohlhase | first3 = Karsten | last3 = Konrad | author2link=Michael Kohlhase| chapter=A Multi-Level, Higher-Order Unification Approach to Ellipsis| title=Submitted to European [[Association for Computational Linguistics]] (EACL)| year=1997| volume=| pages=| publisher=| editor=| series=|citeseerx = 10.1.1.55.9018}}</ref>

For example, the unification problem { ''f''(''a'', ''b'', ''a'') ≐ ''d''(''b'', ''a'', ''c'') }, where the only variable is ''f'', has the
solutions {''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''y'', ''x'', ''c'') }, {''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''y'', ''z'', ''c'') },
{''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''y'', ''a'', ''c'') }, {''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''b'', ''x'', ''c'') },
{''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''b'', ''z'', ''c'') } and {''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''b'', ''a'', ''c'') }.

[[Wayne Snyder]] gave a generalization of both higher-order unification and E-unification, i.e. an algorithm to unify lambda-terms modulo an equational theory.<ref>{{cite book | author=Wayne Snyder | contribution=Higher order E-unification | title=Proc. 10th [[Conference on Automated Deduction]] | publisher=Springer | series=LNAI | volume=449 | pages=573–587 |date=Jul 1990 }}</ref>

==See also==
*[[Rewriting]]
*[[Admissible rule]]
*[[Explicit substitution]] in [[lambda calculus]]
* Mathematical [[Equation solving]]
* [[Dis-unification (computer science)|Dis-unification]]: solving inequations between symbolic expression
* [[Anti-unification (computer science)|Anti-unification]]: computing a least general generalization (lgg) of two terms, dual to computing a most general instance (mgu)
* [[Ontology alignment]] (use ''unification'' with [[semantic equivalence]])

==Notes==
{{Reflist|group=note}}

==References==
{{reflist}}

== Further reading ==
* [[Franz Baader]] and [[Wayne Snyder]] (2001). [http://www.cs.bu.edu/~snyder/publications/UnifChapter.pdf "Unification Theory"]. In [[John Alan Robinson]] and [[Andrei Voronkov]], editors, ''[[Handbook of Automated Reasoning]]'', volume I, pages 447–533. Elsevier Science Publishers.
* [[Gilles Dowek]] (2001). [https://who.rocq.inria.fr/Gilles.Dowek/Publi/unification.ps "Higher-order Unification and Matching"]. In ''Handbook of Automated Reasoning''.
* Franz Baader and [[Tobias Nipkow]] (1998). [http://www.in.tum.de/~nipkow/TRaAT/ ''Term Rewriting and All That'']. Cambridge University Press.
* Franz Baader and [[Jörg H. Siekmann]] (1993). "Unification Theory". In ''Handbook of Logic in Artificial Intelligence and Logic Programming''.
* Jean-Pierre Jouannaud and [[Claude Kirchner]] (1991). "Solving Equations in Abstract Algebras: A Rule-Based Survey of Unification". In ''Computational Logic: Essays in Honor of Alan Robinson''.
* [[Nachum Dershowitz]] and [[Jean-Pierre Jouannaud]], ''Rewrite Systems'', in: [[Jan van Leeuwen]] (ed.), ''[[Handbook of Theoretical Computer Science]]'', volume B ''Formal Models and Semantics'', Elsevier, 1990, pp. 243–320
* Jörg H. Siekmann (1990). "Unification Theory". In [[Claude Kirchner]] (editor) ''Unification''. Academic Press.
* {{cite journal| author=Kevin Knight| title=Unification: A Multidisciplinary Survey| journal=ACM Computing Surveys|date=Mar 1989| volume=21| number=1| pages=93–124| url=http://www.isi.edu/natural-language/people/unification-knight.pdf| doi=10.1145/62029.62030}}
* [[Gérard Huet]] and [[Derek C. Oppen]] (1980). [http://infolab.stanford.edu/pub/cstr/reports/cs/tr/80/785/CS-TR-80-785.pdf "Equations and Rewrite Rules: A Survey"]. Technical report. Stanford University.
* {{cite journal | last1 = Raulefs | first1 = Peter | last2 = Siekmann | first2 = Jörg | last3 = Szabó | first3 = P. | last4 = Unvericht | first4 = E. | year = 1979 | title = A short survey on the state of the art in matching and unification problems | url = | journal = ACM SIGSAM Bulletin | volume = 13 | issue = 2 }}
* Claude Kirchner and Hélène Kirchner. ''Rewriting, Solving, Proving''. In preparation.

[[Category:Automated theorem proving]]
[[Category:Logic programming]]
[[Category:Rewriting systems]]
[[Category:Logic in computer science]]
[[Category:Type theory]]
[[Category:Unification (computer science)| ]]

Unification (computer science)

2017-02-04T15:09:15Z

Magmalex: /* Higher-order unification */ Corrected formula

In [[logic]] and [[computer science]], '''unification''' is an algorithmic process of [[equation solving|solving]] [[equations]] between symbolic [[expression (mathematics)|expressions]].

Depending on which expressions (also called ''terms'') are allowed to occur in an equation set (also called ''unification problem''), and which expressions are considered equal, several frameworks of unification are distinguished. If higher-order variables, that is, variables representing [[function (mathematics)|function]]s, are allowed in an expression, the process is called '''higher-order unification''', otherwise '''first-order unification'''. If a solution is required to make both sides of each equation literally equal, the process is called '''syntactic''' or '''free unification''', otherwise '''semantic''' or '''equational unification''', or '''E-unification''', or '''unification modulo theory'''.

A ''solution'' of a unification problem is denoted as a [[substitution (logic)|substitution]], that is, a mapping assigning a symbolic value to each variable of the problem's expressions. A unification algorithm should compute for a given problem a ''complete'', and ''minimal'' substitution set, that is, a set covering all its solutions, and containing no redundant members. Depending on the framework, a complete and minimal substitution set may have at most one, at most finitely many, or possibly infinitely many members, or may not exist at all.<ref group=note>in this case, still a complete substitution set exists (e.g. the set of all solutions at all); however, each such set contains redundant members.</ref><ref>{{cite journal|first1=François|last1=Fages|first2=Gérard|last2=Huet|title=Complete Sets of Unifiers and Matchers in Equational Theories|journal=Theoretical Computer Science|volume=43|pages=189–200|year=1986|doi=10.1016/0304-3975(86)90175-1}}</ref> In some frameworks it is generally impossible to decide whether any solution exists. For first-order syntactical unification, Martelli and Montanari<ref name="Martelli.Montanari.1982">{{cite journal|first1=Alberto|last1=Martelli|first2=Ugo|last2=Montanari|title=An Efficient Unification Algorithm|journal=ACM Trans. Program. Lang. Syst.|volume=4|number=2|pages=258–282|date=Apr 1982|doi=10.1145/357162.357169}}</ref> gave an algorithm that reports unsolvability or computes a complete and minimal singleton substitution set containing the so-called '''most general unifier'''.

For example, using ''x'',''y'',''z'' as variables, the singleton equation set { ''[[cons]]''(''x'',''cons''(''x'',''[[Lisp (programming language)#Lists|nil]]'')) = ''cons''(2,''y'') } is a syntactic first-order unification problem that has the substitution { ''x'' ↦ 2, ''y'' ↦ ''cons''(2,''nil'') } as its only solution.
The syntactic first-order unification problem { ''y'' = ''cons''(2,''y'') } has no solution over the set of [[term (logic)|finite terms]]; however, it has the single solution { ''y'' ↦ ''cons''(2,''cons''(2,''cons''(2,...))) } over the set of [[Tree (set theory)|infinite trees]].
The semantic first-order unification problem { ''a''⋅''x'' = ''x''⋅''a'' } has each substitution of the form { ''x'' ↦ ''a''⋅...⋅''a'' } as a solution in a [[semigroup]], i.e. if (⋅) is considered [[associative]]; the same problem, viewed in an [[abelian group]], where (⋅) is considered also [[commutative]], has any substitution at all as a solution.
The singleton set { ''a'' = ''y''(''x'') } is a syntactic second-order unification problem, since ''y'' is a function variable.
One solution is { ''x'' ↦ ''a'', ''y'' ↦ ([[identity function]]) }; another one is { ''y'' ↦ ([[constant function]] mapping each value to ''a''), ''x'' ↦ ''(any value)'' }.

The first formal investigation of unification can be attributed to [[J. Alan Robinson|John Alan Robinson]],<ref name="Robinson.1965">{{cite journal | author=J.A. Robinson | title=A Machine-Oriented Logic Based on the Resolution Principle | journal=Journal of the ACM | volume=12 | number=1 | pages=23–41 |date=Jan 1965 | doi=10.1145/321250.321253}}; Here: sect.5.8, p.32</ref><ref>{{cite journal | author=J.A. Robinson | title=Computational logic: The unification computation | journal=Machine Intelligence | volume=6 | pages=63–72 | url=http://aitopics.org/sites/default/files/classic/Machine%20Intelligence%206/MI6-Ch4-Robinson.pdf | year=1971 }}</ref> who used first-order syntactical unification as a basic building block of his [[Resolution (logic)|resolution]] procedure for first-order logic, a great step forward in [[automated reasoning]] technology, as it eliminated one source of combinatorial explosion: searching for instantiation of terms. Today, automated reasoning is still the main application area of unification.
Syntactical first-order unification is used in [[logic programming]] and programming language [[type system]] implementation, especially in [[Hindley–Milner]] based [[type inference]] algorithms.
Semantic unification is used in [[SMT solver]]s, [[term rewriting]] algorithms and [[cryptographic protocol]] analysis.
Higher-order unification is used in proof assistants, for example [[Isabelle (theorem prover)|Isabelle]] and [[Twelf]], and restricted forms of higher-order unification ('''higher-order pattern unification''') are used in some programming language implementations, such as [[lambdaProlog]], as higher-order patterns are expressive, yet their associated unification procedure retains theoretical properties closer to first-order unification.

==Common formal definitions==

===Prerequisites===

Formally, a unification approach presupposes
* An infinite set ''V'' of '''variables'''. For higher-order unification, it is convenient to choose ''V'' disjoint from the set of [[Lambda term#Lambda terms|lambda-term bound variables]].
* A set ''T'' of '''terms''' such that ''V'' ⊆ ''T''. For first-order unification and higher-order unification, ''T'' is usually the set of [[Term (first-order logic)#Terms|first-order terms]] (terms built from variable and function symbols) and [[Lambda term#Lambda terms|lambda terms]] (terms containing some higher-order variables), respectively.
* A mapping ''vars'': ''T'' → [[power set|ℙ]](''V''), assigning to each term ''t'' the set ''vars''(''t'') ⊊ ''V'' of '''free variables''' occurring in ''t''.
* An '''[[equivalence relation]]''' ≡ on ''T'', indicating which terms are considered equal. For higher-order unification, usually ''t'' ≡ ''u'' if ''t'' and ''u'' are [[Lambda term#Alpha equivalence|alpha equivalent]]. For first-order E-unification, ≡ reflects the background knowledge about certain function symbols; for example, if ⊕ is considered commutative, ''t'' ≡ ''u'' if ''u'' results from ''t'' by swapping the arguments of ⊕ at some (possibly all) occurrences. <ref group=note>E.g. ''a'' ⊕ (''b'' ⊕ ''f''(''x'')) ≡ ''a'' ⊕ (''f''(''x'') ⊕ ''b'') ≡ (''b'' ⊕ ''f''(''x'')) ⊕ ''a'' ≡ (''f''(''x'') ⊕ ''b'') ⊕ ''a''</ref> If there is no background knowledge at all, then only literally, or syntactically, identical terms are considered equal; in this case, ≡ is called the '''[[free theory]]''' (because it is a [[free object]]), the '''[[empty theory]]''' (because the set of equational [[sentence (mathematical logic)|sentences]], or the background knowledge, is empty), the '''theory of [[uninterpreted function]]s''' (because unification is done on uninterpreted [[term (logic)|terms]]), or the '''theory of [[Algebraic specification|constructors]]''' (because all function symbols just build up data terms, rather than operating on them).

===First-order term===
{{main|Term (logic)}}
Given a set ''V'' of variable symbols, a set ''C'' of constant symbols and sets ''F''''n'' of ''n''-ary function symbols, also called operator symbols, for each natural number ''n'' ≥ 1, the set of (unsorted first-order) terms ''T'' is [[recursive definition|recursively defined]] to be the smallest set with the following properties:<ref>{{cite book| author1=C.C. Chang |author1link= Chen Chung Chang|author2= H. Jerome Keisler| author2link=Howard Jerome Keisler| title=Model Theory| year=1977| volume=73| publisher=North Holland| editor=A. Heyting and H.J. Keisler and A. Mostowski and A. Robinson and P. Suppes| series=Studies in Logic and the Foundation of Mathematics}}; here: Sect.1.3</ref>
* every variable symbol is a term: ''V'' ⊆ ''T'',
* every constant symbol is a term: ''C'' ⊆ ''T'',
* from every ''n'' terms ''t''1,...,''t''''n'', and every ''n''-ary function symbol ''f'' ∈ ''F''''n'', a larger term ''f''(''t''1,...,''t''''n'') can be built.
For example, if ''x'' ∈ ''V'' is a variable symbol, 1 ∈ ''C'' is a constant symbol, and ''add'' ∈ ''F''2 is a binary function symbol, then ''x'' ∈ ''T'', 1 ∈ ''T'', and (hence) ''add''(''x'',1) ∈ ''T'' by the first, second, and third term building rule, respectively. The latter term is usually written as ''x''+1, using [[infix notation]] and the more common operator symbol + for convenience.


===Higher-order term===
{{main|Lambda calculus}}

===Substitution===
{{main|Substitution (logic)}}
A '''substitution''' is a mapping σ: ''V'' → ''T'' from variables to terms; the notation {{math|{{mset| ''x''1 ↦ ''t''1, ..., ''x''''k'' ↦ ''t''''k'' }}}} refers to a substitution mapping each variable ''x''''i'' to the term ''t''''i'', for ''i''=1,...,''k'', and every other variable to itself. '''Applying''' that substitution to a term ''t'' is written in [[postfix notation]] as {{math|''t'' {{mset|''x''1 ↦ ''t''1, ..., ''x''''k'' ↦ ''t''''k''}}}}; it means to (simultaneously) replace every occurrence of each variable ''x''''i'' in the term ''t'' by ''t''''i''. The result ''t''σ of applying a substitution σ to a term ''t'' is called an '''instance''' of that term ''t''.
As a first-order example, applying the substitution {{math|{{mset| ''x'' ↦ ''h''(''a'',''y''), ''z'' ↦ ''b'' }}}} to the term
{|
|-
|
| ''f''(
| align="center" | '''''x'''''
|, ''a'', ''g''(
| '''''z'''''
| ), ''y'')
|-
| yields  
|-
|
| ''f''(
| '''''h'''''('''''a''''','''''y''''')
|, ''a'', ''g''(
| '''''b'''''
| ), ''y'').
|}

===Generalization, specialization===

If a term ''t'' has an instance equivalent to a term ''u'', that is, if {{math|1=''tσ'' ≡ ''u''}} for some substitution σ, then ''t'' is called '''more general''' than ''u'', and ''u'' is called '''more special''' than, or '''subsumed''' by, ''t''. For example,{{math|1= ''x'' ⊕ ''a''}} is more general than {{math|1=''a'' ⊕ ''b''}} if ⊕ is [[Commutative property|commutative]], since then {{math|1=(''x'' ⊕ ''a'') {{mset|''x''↦''b''}} = ''b'' ⊕ ''a'' ≡ ''a'' ⊕ ''b''}}.

If ≡ is literal (syntactic) identity of terms, a term may be both more general and more special than another one only if both terms differ just in their variable names, not in their syntactic structure; such terms are called '''variants''', or '''renamings''' of each other.
For example,
{{math|''f''(''x''1,''a'',''g''(''z''1),''y''1)}}
is a variant of
{{math|''f''(''x''2,''a'',''g''(''z''2),''y''2)}},
since

{{math|''f''(''x''1,''a'',''g''(''z''1),''y''1)
{{mset|''x''1 ↦ ''x''2, ''y''1 ↦ ''y''2, ''z''1 ↦ ''z''2}} {{=}}
''f''(''x''2,''a'',''g''(''z''2),''y''2)}}
and
{{math|''f''(''x''2,''a'',''g''(''z''2),''y''2)
{{mset|''x''2 ↦ ''x''1, ''y''2 ↦ ''y''1, ''z''2 ↦ ''z''1}} {{=}}
''f''(''x''1,''a'',''g''(''z''1),''y''1)}}.
However,
{{math|''f''(''x''1,''a'',''g''(''z''1),''y''1)}}
is ''not'' a variant of
{{math|''f''(''x''2,''a'',''g''(''x''2),''x''2)}},
since no substitution can transform the latter term into the former one.
The latter term is therefore properly more special than the former one.

For arbitrary ≡, a term may be both more general and more special than a structurally different term.
For example, if ⊕ is [[idempotent]], that is, if always {{math|1=''x'' ⊕ ''x'' ≡ ''x''}}, then the term {{math|1=''x'' ⊕ ''y''}} is more general than {{math|1=(''x'' ⊕ ''y'') {{mset|''x'' ↦ ''z'', ''y'' ↦ ''z''}} = ''z'' ⊕ ''z'' ≡ ''z''}}, and vice versa ''z'' is more general than {{math|1=''z'' {{mset|''z'' ↦ ''x'' ⊕ ''y''}} = ''x'' ⊕ ''y''}}, although {{math|1=''x''⊕''y''}} and ''z'' are of different structure.

A substitution {{mvar|σ}} is '''more special''' than, or '''subsumed''' by, a substitution {{mvar|τ}} if {{mvar|tσ}} is more special than {{mvar|tτ}} for each term {{mvar|t}}. We also say that {{mvar|τ}} is more general than {{mvar|σ}}.
For instance {{math|1={{mset| ''x'' ↦ ''a'', ''y'' ↦ ''a'' }}}} is more special than {{math|1=τ = {{mset| ''x'' ↦ ''y'' }}}},
but
{{math|1=''σ'' = {{mset| ''x'' ↦ ''a'' }}}} is not,
as {{math|1=''f''(''x'',''y'')''σ'' = ''f''(''a'',''y'')}} is not more special than
{{math|1=''f''(''x'',''y'')''τ'' = ''f''(''y'',''y'')}}.<ref>K.R. Apt. "From Logic Programming to Prolog", p. 24. Prentice Hall, 1997.</ref>

===Unification problem, solution set===

A '''unification problem''' is a finite set {{math|{{mset| ''l''1 ≐ ''r''1, ..., ''l''''n'' ≐ ''r''''n'' }}}} of potential equations, where {{math|''l''''i'', ''r''''i'' ∈ ''T''}}.
A substitution σ is a '''solution''' of that problem if {{math|''l''''i''σ ≡ ''r''''i''σ}} for {{math|1=''i''=1,...,''n''}}. Such a substitution is also called a '''unifier''' of the unification problem.
For example, if ⊕ is [[Associative property|associative]], the unification problem { ''x'' ⊕ ''a'' ≐ ''a'' ⊕ ''x'' } has the solutions {''x'' ↦ ''a''}, {''x'' ↦ ''a'' ⊕ ''a''}, {''x'' ↦ ''a'' ⊕ ''a'' ⊕ ''a''}, etc., while the problem { ''x'' ⊕ ''a'' ≐ ''a'' } has no solution.

For a given unification problem, a set ''S'' of unifiers is called '''complete''' if each solution substitution is subsumed by some substitution σ ∈ ''S''; the set ''S'' is called '''minimal''' if none of its members subsumes another one.

==Syntactic unification of first-order terms==

[[File:Triangle diagram of syntactic unification svg.svg|thumb|Schematic triangle diagram of syntactically unifying terms ''t''1 and ''t''2 by a substitution σ]]
''Syntactic unification of first-order terms'' is the most widely used unification framework.
It is based on ''T'' being the set of ''first-order terms'' (over some given set ''V'' of variables, ''C'' of constants and ''F''''n'' of ''n''-ary function symbols) and on ≡ being ''syntactic equality''.
In this framework, each solvable unification problem {{math|{{mset|''l''1 ≐ ''r''1, ..., ''l''''n'' ≐ ''r''''n''}}}} has a complete, and obviously minimal, [[Singleton (mathematics)|singleton]] solution set {{math|1={{mset|''σ''}}}}.
Its member {{mvar|σ}} is called the '''most general unifier''' ('''mgu''') of the problem.
The terms on the left and the right hand side of each potential equation become syntactically equal when the mgu is applied i.e. {{math|1=''l''1''σ'' = ''r''1''σ'' ∧ ... ∧ ''l''''n''''σ'' = ''r''''n''''σ''}}.
Any unifier of the problem is subsumed<ref group=note>formally: each unifier τ satisfies {{math|1=∀''x'': ''xτ'' = (''xσ'')''ρ''}} for some substitution ρ</ref> by the mgu {{mvar|σ}}.
The mgu is unique up to variants: if ''S''1 and ''S''2 are both complete and minimal solution sets of the same syntactical unification problem, then ''S''1 = { ''σ''1 } and ''S''2 = { ''σ''2 } for some substitutions {{math|1=''σ''1}} and {{math|1=''σ''2,}} and {{math|1=''xσ''1}} is a variant of {{math|1=''xσ''2}} for each variable ''x'' occurring in the problem.

For example, the unification problem { ''x'' ≐ ''z'', ''y'' ≐ ''f''(''x'') } has a unifier { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }, because
:{|
|-
| align="right" | ''x''
| { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }
| =
| align="center" | ''z''
| =
| align="right" | ''z''
| { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }
|, and
|-
| align="right" | ''y''
| { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }
| =
| align="center" | ''f''(''z'')
| =
| align="right" | ''f''(''x'')
| { ''x'' ↦ ''z'', ''y'' ↦ ''f''(''z'') }
| .
|}

This is also the most general unifier.
Other unifiers for the same problem are e.g. { ''x'' ↦ ''f''(''x''1), ''y'' ↦ ''f''(''f''(''x''1)), ''z'' ↦ ''f''(''x''1) }, { ''x'' ↦ ''f''(''f''(''x''1)), ''y'' ↦ ''f''(''f''(''f''(''x''1))), ''z'' ↦ ''f''(''f''(''x''1)) }, and so on; there are infinitely many similar unifiers.

As another example, the problem ''g''(''x'',''x'') ≐ ''f''(''y'') has no solution with respect to ≡ being literal identity, since any substitution applied to the left and right hand side will keep the outermost ''g'' and ''f'', respectively, and terms with different outermost function symbols are syntactically different.

===A unification algorithm===

{{Quote box|title=Robinson's 1965 unification algorithm
|quote={{hidden begin}}
Symbols are ordered such that variables precede function symbols.
Terms are ordered by increasing written length; equally long terms
are ordered [[lexicographic order|lexicographically]].{{refn|Robinson (1965);<ref name="Robinson.1965"/> nr.2.5, 2.14, p.25}} For a set ''T'' of terms, its disagreement
path ''p'' is the lexicographically least path where two member terms
of ''T'' differ. Its disagreement set is the set of [[term (logic)#Operations with terms|subterms starting at ''p'']],
formally: {{math|{ ''t''[[term (logic)#Operations with terms|{{pipe}}''p'']] : ''t''∈''T'' }}}.{{refn|Robinson (1965);<ref name="Robinson.1965"/> nr.5.6, p.32}}

'''Algorithm:'''{{refn|Robinson (1965);<ref name="Robinson.1965"/> nr.5.8, p.32}}

Given a set ''T'' of terms to be unified
Let σ initially be the [[substitution (logic)#First-order logic|identity substitution]]

'''do''' '''forever'''
'''if''' ''T''σ is a [[singleton set]] '''then'''
'''return''' σ
'''fi'''

let ''D'' be the disagreement set of ''T''σ
let ''s'', ''t'' be the two lexicographically least terms in ''D''

'''if''' ''s'' is not a variable '''or''' ''s'' occurs in ''t'' '''then'''
'''return''' "NONUNIFIABLE"
'''fi'''
σ := σ { ''s''↦''t'' }
'''done'''
{{hidden end}}
}}
The first algorithm given by Robinson (1965) was rather inefficient; cf. box.
The following faster algorithm originated from Martelli, Montanari (1982).<ref>Alg.1, p.261. Their rule '''(a)''' corresponds to rule '''swap''' here, '''(b)''' to '''delete''', '''(c)''' to both '''decompose''' and '''conflict''', and '''(d)''' to both '''eliminate''' and '''check'''.</ref>
This paper also lists preceding attempts to find an efficient syntactical unification algorithm,<ref>{{cite report | author=Lewis Denver Baxter | title=A practically linear unification algorithm | publisher=Univ. of Waterloo, Ontario | type=Res. Report | volume=CS-76-13 | url=https://cs.uwaterloo.ca/research/tr/1976/CS-76-13.pdf |date=Feb 1976 }}</ref><ref>{{cite thesis | author=[[Gérard Huet]] | title=Resolution d'Equations dans des Langages d'Ordre 1,2,...ω | publisher=Universite de Paris VII | type=These d'etat |date=Sep 1976 }}</ref><ref name="Martelli.Montanari.1976">{{cite report |author1=Alberto Martelli |author2=Ugo Montanari |lastauthoramp=yes | title=Unification in linear time and space: A structured presentation | publisher=Consiglio Nazionale delle Ricerche, Pisa | type=Internal Note | volume=IEI-B76-16 | url=http://puma.isti.cnr.it/publichtml/section_cnr_iei/cnr_iei_1976-B4-041.html |date=Jul 1976 }}</ref><ref name="Paterson.Wegman.1978">{{cite journal | author=[[Michael Stewart Paterson]] and M.N. Wegman | title=Linear unification | journal=J. Comput. Syst. Sci. | volume=16 | number=2 | pages=158–167 | url=http://www.sciencedirect.com/science/article/pii/0022000078900430/pdf?md5=404ce04b363525aef2a1277b2ec249d1&pid=1-s2.0-0022000078900430-main.pdf |date=Apr 1978 | doi = 10.1016/0022-0000(78)90043-0 }}</ref><ref>{{cite book | author=[[J.A. Robinson]] |chapter= Fast unification | editor= [[Woodrow W. Bledsoe]], Michael M. Richter| title=Proc. Theorem Proving Workshop Oberwolfach | publisher= | series=Oberwolfach Workshop Report | volume=1976/3 | url= http://oda.mfo.de/bsz325106819.html |date=Jan 1976 }}</ref><ref>{{cite journal | author=M. Venturini-Zilli | title=Complexity of the unification algorithm for first-order expressions |journal= Calcolo | volume=12 |number=4 |pages= 361–372 |date= Oct 1975 }}</ref> and states that linear-time algorithms were discovered independently by Martelli, Montanari (1976)<ref name="Martelli.Montanari.1976"/> and Paterson, Wegman (1978).<ref name="Paterson.Wegman.1978"/>{{refn|See Martelli, Montanari (1982),<ref name="Martelli.Montanari.1982"/> sect.1, p.259. Paterson's and Wegman's paper is dated 1978; however, the journal publisher received it in Sep.1976.}}

Given a finite set ''G'' = { ''s''1 ≐ ''t''1, ..., ''s''''n'' ≐ ''t''''n'' } of potential equations,
the algorithm applies rules to transform it to an equivalent set of equations of the form
{ ''x''1 ≐ ''u''1, ..., ''x''''m'' ≐ ''u''''m'' }
where ''x''1, ..., ''x''''m'' are distinct variables and ''u''1, ..., ''u''''m'' are terms containing none of the ''x''''i''.
A set of this form can be read as a substitution.
If there is no solution the algorithm terminates with ⊥; other authors use "Ω", "{}", or "''fail''" in that case.
The operation of substituting all occurrences of variable ''x'' in problem ''G'' with term ''t'' is denoted ''G'' {''x'' ↦ ''t''}.
For simplicity, constant symbols are regarded as function symbols having zero arguments.

:{|
| align="right" | ''G'' ∪ { ''t'' ≐ ''t'' }
| ⇒
| ''G''
|
|     '''delete'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''0,...,''s''''k'') ≐ ''f''(''t''0,...,''t''''k'') }
| ⇒
| ''G'' ∪ { ''s''0 ≐ ''t''0, ..., ''s''''k'' ≐ ''t''''k'' }
|
|     '''decompose'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''0,...,''s''''k'') ≐ ''g''(''t''0,...,''t''''m'') }
| ⇒
| ⊥
| align="right" | if ''f'' ≠ ''g'' or ''k'' ≠ ''m''
|     '''conflict'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''0,...,''s''''k'') ≐ x }
| ⇒
| ''G'' ∪ { ''x'' ≐ ''f''(''s''0,...,''s''''k'') }
|
|     '''swap'''
|-
| align="right" | ''G'' ∪ { ''x'' ≐ ''t'' }
| ⇒
| ''G''{''x''↦''t''} ∪ { ''x'' ≐ ''t'' }
| align="right" | if ''x'' ∉ ''vars''(''t'') and ''x'' ∈ ''vars''(''G'')
|     '''eliminate'''<ref group="note">Although the rule keeps ''x''≐''t'' in ''G'', it cannot loop forever since its precondition ''x''∈''vars''(''G'') is invalidated by its first application. More generally, the algorithm is guaranteed to terminate always, see [[#Proof of termination|below]].</ref>
|-
| align="right" | ''G'' ∪ { ''x'' ≐ ''f''(''s''0,...,''s''''k'') }
| ⇒
| ⊥
| align="right" | if ''x'' ∈ ''vars''(''f''(''s''0,...,''s''''k''))
|     '''check'''
|}

====Occurs check====
{{main|Occurs check}}
An attempt to unify a variable ''x'' with a term containing ''x'' as a strict subterm ''x''≐''f''(...,''x'',...) would lead to an infinite term as solution for ''x'', since ''x'' would occur as a subterm of itself.
In the set of (finite) first-order terms as defined above, the equation ''x''≐''f''(...,''x'',...) has no solution; hence the ''eliminate'' rule may only be applied if ''x'' ∉ ''vars''(''t'').
Since that additional check, called ''occurs check'', slows down the algorithm, it is omitted e.g. in most Prolog systems.
From a theoretical point of view, omitting the check amounts to solving equations over infinite trees, see [[#Unification of infinite terms|below]].

====Proof of termination====
For the proof of termination of the algorithm consider a triple {{math|<''n''''var'',''n''''lhs'',''n''''eqn''>}}
where {{math|''n''''var''}} is the number of variables that occur more than once in the equation set, {{math|''n''''lhs''}} is the number of function symbols and constants
on the left hand sides of potential equations, and {{math|''n''''eqn''}} is the number of equations.
When rule ''eliminate'' is applied, {{math|''n''''var''}} decreases, since ''x'' is eliminated from ''G'' and kept only in { ''x'' ≐ ''t'' }.
Applying any other rule can never increase {{math|''n''''var''}} again.
When rule ''decompose'', ''conflict'', or ''swap'' is applied, {{math|''n''''lhs''}} decreases, since at least the left hand side's outermost ''f'' disappears.
Applying any of the remaining rules ''delete'' or ''check'' can't increase {{math|''n''''lhs''}}, but decreases {{math|''n''''eqn''}}.
Hence, any rule application decreases the triple {{math|<''n''''var'',''n''''lhs'',''n''''eqn''>}} with respect to the [[lexicographical order]], which is possible only a finite number of times.

[[Conor McBride]] observes<ref>{{cite journal|last=McBride|first=Conor|title=First-Order Unification by Structural Recursion|journal=Journal of Functional Programming|date=October 2003|volume=13|issue=6|pages=1061–1076|doi=10.1017/S0956796803004957|url=http://strictlypositive.org/unify.ps.gz|accessdate=30 March 2012|issn=0956-7968}}</ref> that “by expressing the structure which unification exploits” in a [[Dependent type|dependently typed]] language such as [[Epigram (programming language)|Epigram]], [[John Alan Robinson|Robinson]]'s algorithm can be made [[Structural induction|recursive on the number of variables]], in which case a separate termination proof becomes unnecessary.

===Examples of syntactic unification of first-order terms===
In the Prolog syntactical convention a symbol starting with an upper case letter is a variable name; a symbol that starts with a lowercase letter is a function symbol; the comma is used as the logical ''and'' operator.
For maths notation, ''x,y,z'' are used as variables, ''f,g'' as function symbols, and ''a,b'' as constants.
{| class="wikitable"
|-
! Prolog Notation !! Maths Notation !! Unifying Substitution !! Explanation
|-
| <code> a = a </code> || { ''a'' = ''a'' } || {} || Succeeds. ([[Tautology (logic)|tautology]])
|-
| <code> a = b </code> || { ''a'' = ''b'' } || ⊥ || ''a'' and ''b'' do not match
|-
| <code> X = X </code> || { ''x'' = ''x'' } || {} || Succeeds. ([[Tautology (logic)|tautology]])
|-
| <code> a = X </code> || { ''a'' = ''x'' } || { ''x'' ↦ ''a'' } || ''x'' is unified with the constant ''a''
|-
| <code> X = Y </code> || { ''x'' = ''y'' } || { ''x'' ↦ ''y'' } || ''x'' and ''y'' are aliased
|-
| <code> f(a,X) = f(a,b) </code> || { ''f''(''a'',''x'') = ''f''(''a'',''b'') } || { ''x'' ↦ ''b'' } || function and constant symbols match, ''x'' is unified with the constant ''b''
|-
| <code> f(a) = g(a) </code> || { ''f''(''a'') = ''g''(''a'') } || ⊥ || ''f'' and ''g'' do not match
|-
| <code> f(X) = f(Y) </code> || { ''f''(''x'') = ''f''(''y'') } || { ''x'' ↦ ''y'' } || ''x'' and ''y'' are aliased
|-
| <code> f(X) = g(Y) </code> || { ''f''(''x'') = ''g''(''y'') } || ⊥ || ''f'' and ''g'' do not match
|-
| <code> f(X) = f(Y,Z) </code> || { ''f''(''x'') = ''f''(''y'',''z'') } || ⊥ || Fails. The ''f'' function symbols have different arity
|-
| <code> f(g(X)) = f(Y) </code> || { ''f''(''g''(''x'')) = ''f''(''y'') } || { ''y'' ↦ ''g''(''x'') } || Unifies ''y'' with the term {{tmath|g(x)}}
|-
| <code> f(g(X),X) = f(Y,a) </code> || { ''f''(''g''(''x''),''x'') = ''f''(''y'',''a'') } || { ''x'' ↦ ''a'', ''y'' ↦ ''g''(''a'') } || Unifies ''x'' with constant ''a'', and ''y'' with the term {{tmath|g(a)}}
|-
| <code> X = f(X) </code> || { ''x'' = ''f''(''x'') } || should be ⊥ || Returns ⊥ in first-order logic and many modern Prolog dialects (enforced by the ''[[occurs check]]'').
Succeeds in traditional Prolog and in Prolog II, unifying ''x'' with infinite term <code>x=f(f(f(f(...))))</code>.
|-
| <code> X = Y, Y = a </code> || { ''x'' = ''y'', ''y'' = ''a'' } || { ''x'' ↦ ''a'', ''y'' ↦ ''a'' } || Both ''x'' and ''y'' are unified with the constant ''a''
|-
| <code> a = Y, X = Y </code> || { ''a'' = ''y'', ''x'' = ''y'' } || { ''x'' ↦ ''a'', ''y'' ↦ ''a'' } || As above (order of equations in set doesn't matter)
|-
| <code> X = a, b = X </code> || { ''x'' = ''a'', ''b'' = ''x'' } || ⊥ || Fails. ''a'' and ''b'' do not match, so ''x'' can't be unified with both
|}

[[File:Unification exponential blow-up svg.svg|thumb|Two terms with an exponentially larger tree for their least common instance. Its [[directed acyclic graph|dag]] representation (rightmost, orange part) is still of linear size.]]
The most general unifier of a syntactic first-order unification problem of [[Term (logic)#Operations with terms|size]] {{mvar|n}} may have a size of {{math|2''n''}}. For example, the problem {{tmath| (((a*z)*y)*x)*w \doteq w*(x*(y*(z*a))) }} has the most general unifier {{tmath| z \mapsto a, y \mapsto a*a, x \mapsto (a*a)*(a*a), w \mapsto ((a*a)*(a*a))*((a*a)*(a*a)) }}, cf. picture. In order to avoid exponential time complexity caused by such blow-up, advanced unification algorithms work on [[directed acyclic graph]]s (dags) rather than trees.{{refn|e.g. Paterson, Wegman (1978),<ref name="Paterson.Wegman.1978"/> sect.2, p.159}}

===Application: Unification in logic programming===

The concept of unification is one of the main ideas behind [[logic programming]], best known through the language [[Prolog]]. It represents the mechanism of binding the contents of variables and can be viewed as a kind of one-time assignment. In Prolog, this operation is denoted by the equality symbol <code>=</code>, but is also done when instantiating variables (see below). It is also used in other languages by the use of the equality symbol <code>=</code>, but also in conjunction with many operations including <code>+</code>, <code>-</code>, <code>*</code>, <code>/</code>. [[Type inference]] algorithms are typically based on unification.

In Prolog:
# A [[variable (programming)|variable]] which is uninstantiated—i.e. no previous unifications were performed on it—can be unified with an atom, a term, or another uninstantiated variable, thus effectively becoming its alias. In many modern Prolog dialects and in [[first-order logic]], a variable cannot be unified with a term that contains it; this is the so-called ''[[occurs check]]''.
# Two atoms can only be unified if they are identical.
# Similarly, a term can be unified with another term if the top function symbols and [[Arity|arities]] of the terms are identical and if the parameters can be unified simultaneously. Note that this is a recursive behavior.

=== Application: Type inference ===

Unification is used during type inference, for instance in the functional programming language [[Haskell (programming language)|Haskell]]. On one hand, the programmer does not need to provide type information for every function, on the other hand it is used to detect typing errors. The Haskell expression <code>True : ['a', 'b', 'c']</code> is not correctly typed. The list construction function <code>(:)</code> is of type <code>a -> [a] -> [a]</code>, and for the first argument <code>True</code> the polymorphic type variable <code>a</code> has to be unified with <code>True</code>'s type, <code>Bool</code>. The second argument, <code>['a', 'b', 'c']</code>, is of type <code>[Char]</code>, but <code>a</code> cannot be both <code>Bool</code> and <code>Char</code> at the same time.

Like for Prolog, an algorithm for type inference can be given:

# Any type variable unifies with any type expression, and is instantiated to that expression. A specific theory might restrict this rule with an occurs check.
# Two type constants unify only if they are the same type.
# Two type constructions unify only if they are applications of the same type constructor and all of their component types recursively unify.

Due to its declarative nature, the order in a sequence of unifications is (usually) unimportant.

Note that in the terminology of [[first-order logic]], an atom is a basic proposition and is unified similarly to a Prolog term.

==Order-sorted unification==
''[[Many-sorted logic#Order-sorted logic|Order-sorted logic]]'' allows one to assign a ''sort'', or ''type'', to each term, and to declare a sort ''s''1 a ''subsort'' of another sort ''s''2, commonly written as ''s''1 ⊆ ''s''2. For example, when reаsoning about biological creatures, it is useful to declare a sort ''dog'' to be a subsort of a sort ''animal''. Wherever a term of some sort ''s'' is required, a term of any subsort of ''s'' may be supplied instead.
For example, assuming a function declaration ''mother'': ''animal'' → ''animal'', and a constant declaration ''lassie'': ''dog'', the term ''mother''(''lassie'') is perfectly valid and has the sort ''animal''. In order to supply the information that the mother of a dog is a dog in turn, another declaration ''mother'': ''dog'' → ''dog'' may be issued; this is called ''function overloading'', similar to [[Overloading (programming)|overloading in programming languages]].

[[Christoph Walther|Walther]] gave a unification algorithm for terms in order-sorted logic, requiring for any two declared sorts ''s''1, ''s''2 their intersection ''s''1 ∩ ''s''2 to be declared, too: if ''x''1 and ''x''2 is a variable of sort ''s''1 and ''s''2, respectively, the equation ''x''1 ≐ ''x''2 has the solution { ''x''1 = ''x'', ''x''2 = ''x'' }, where ''x'': ''s''1 ∩ ''s''2.
<ref>{{cite journal|first1=Christoph|last1=Walther|authorlink=Christoph Walther|title=A Mechanical Solution of Schubert's Steamroller by Many-Sorted Resolution|journal=Artif. Intell.|volume=26|number=2|pages=217–224|url=http://www.inferenzsysteme.informatik.tu-darmstadt.de/media/is/publikationen/Schuberts_Steamroller_by_Many-Sorted_Resolution-AIJ-25-2-1985.pdf|year=1985|doi=10.1016/0004-3702(85)90029-3}}</ref>
After incorporating this algorithm into a clause-based automated theorem prover, he could solve a benchmark problem by translating it into order-sorted logic, thereby boiling it down an order of magnitude, as many unary predicates turned into sorts.

Smolka generalized order-sorted logic to allow for [[parametric polymorphism]].
<ref>{{cite conference|first1=Gert|last1=Smolka|title=Logic Programming with Polymorphically Order-Sorted Types|conference=Int. Workshop Algebraic and Logic Programming|publisher=Springer|series=LNCS|volume=343|pages=53–70|date=Nov 1988}}</ref>
In his framework, subsort declarations are propagated to complex type expressions.
As a programming example, a parametric sort ''list''(''X'') may be declared (with ''X'' being a type parameter as in a [[Template (C++)#Function templates|C++ template]]), and from a subsort declaration ''int'' ⊆ ''float'' the relation ''list''(''int'') ⊆ ''list''(''float'') is automatically inferred, meaning that each list of integers is also a list of floats.

Schmidt-Schauß generalized order-sorted logic to allow for term declarations.
<ref>{{cite book|first1=Manfred|last1=Schmidt-Schauß|title=Computational Aspects of an Order-Sorted Logic with Term Declarations|publisher=Springer|series=LNAI|volume=395|date=Apr 1988}}</ref>
As an example, assuming subsort declarations ''even'' ⊆ ''int'' and ''odd'' ⊆ ''int'', a term declaration like ∀''i'':''int''. (''i''+''i''):''even'' allows to declare a property of integer addition that could not be expressed by ordinary overloading.

==Unification of infinite terms==

Background on infinite trees:
* {{cite journal| author=B. Courcelle|authorlink=Bruno Courcelle| title=Fundamental Properties of Infinite Trees| journal=Theoret. Comput. Sci.| year=1983| volume=25| number=| pages=95–169| url=http://www.diku.dk/hjemmesider/ansatte/henglein/papers/courcelle1983.pdf| doi=10.1016/0304-3975(83)90059-2}}
* {{cite book| author=Michael J. Maher| chapter=Complete Axiomatizations of the Algebras of Finite, Rational and Infinite Trees| title=Proc. IEEE 3rd Annual Symp. on Logic in Computer Science, Edinburgh|date=Jul 1988| pages=348–357}}
* {{cite journal|author1=Joxan Jaffar |author2=Peter J. Stuckey | title=Semantics of Infinite Tree Logic Programming| journal=Theoretical Computer Science| year=1986| volume=46| pages=141–158| doi=10.1016/0304-3975(86)90027-7}}

Unification algorithm, Prolog II:
* {{cite book| author=A. Colmerauer| authorlink=Alain Colmerauer|title=Prolog and Infinite Trees| year=1982| pages=| publisher=Academic Press|editor1=K.L. Clark |editor2=S.-A. Tarnlund }}
* {{cite book| author=Alain Colmerauer| chapter=Equations and Inequations on Finite and Infinite Trees| title=Proc. Int. Conf. on Fifth Generation Computer Systems| year=1984| pages=85–99| editor=ICOT}}

Applications:
* {{cite journal|author1=Francis Giannesini |author2=Jacques Cohen | title=Parser Generation and Grammar Manipulation using Prolog's Infinite Trees| journal=J. Logic Programming| year=1984| volume=3| pages=253–265}}

==E-unification==

'''E-unification''' is the problem of finding solutions to a given set of [[equations]],
taking into account some equational background knowledge ''E''.
The latter is given as a set of universal [[Equality (mathematics)|equalities]].
For some particular sets ''E'', equation solving [[algorithms]] (a.k.a. ''E-unification algorithms'') have been devised;
for others it has been proven that no such algorithms can exist.

For example, if {{mvar|a}} and {{mvar|b}} are distinct constants,
the [[equation]] {{tmath|x * a \doteq y * b}} has no solution
with respect to purely [[Unification (computer science)#Syntactic unification problem on first-order terms|syntactic unification]],
where nothing is known about the operator {{tmath|*}}.
However, if the {{tmath|*}} is known to be [[Commutativity|commutative]],
then the substitution {{math|{{mset|''x'' ↦ ''b'', ''y'' ↦ ''a''}}}} solves the above equation,
since
:{|
|
| {{tmath|x * a}}
| {{math|{{mset|''x'' ↦ ''b'', ''y'' ↦ ''a''}}}}
|-
| {{=}}
| {{tmath|b * a}}
|
| by [[Unification (computer science)#Substitution|substitution application]]
|-
| {{=}}
| {{tmath|a * b}}
|
| by commutativity of {{tmath|*}}
|-
| {{=}}
| {{tmath|y * b}}
| {{math|{{mset|''x'' ↦ ''b'', ''y'' ↦ ''a''}}}}
| by (converse) substitution application
|}
The background knowledge ''E'' could state the commutativity of {{tmath|*}} by the universal equality
"{{tmath|1=u * v = v * u}} for all {{math|''u'', ''v''}}".

===Particular background knowledge sets E===

{|
|+ '''Used naming conventions'''
| {{math|∀ ''u'',''v'',''w'':}}
| align="right" | {{tmath|u*(v*w)}}
| {{=}}
| {{tmath|(u*v)*w}}
| align="center" | '''{{mvar|A}}'''
| Associativity of {{tmath|*}}
|-
| {{math|∀ ''u'',''v'':}}
| align="right" | {{tmath|u*v}}
| =
| {{tmath|v*u}}
| align="center" | '''{{mvar|C}}'''
| Commutativity of {{tmath|*}}
|-
| {{math|∀ ''u'',''v'',''w'':}}
| align="right" | {{tmath|u*(v+w)}}
| {{=}}
| {{tmath|u*v+u*w}}
| align="center" | '''{{mvar|Dl}}'''
| Left distributivity of {{tmath|*}} over {{tmath|+}}
|-
| {{math|∀ ''u'',''v'',''w'':}}
| align="right" | {{tmath|(v+w)*u}}
| {{=}}
| {{tmath|v*u+w*u}}
| align="center" | '''{{mvar|Dr}}'''
| Right distributivity of {{tmath|*}} over {{tmath|+}}
|-
| {{math|∀ ''u'':}}
| align="right" | {{tmath|u*u}}
| {{=}}
| {{mvar|u}}
| align="center" | '''{{mvar|I}}'''
| Idempotence of {{tmath|*}}
|-
| {{math|∀ ''u'':}}
| align="right" | {{tmath|n*u}}
| {{=}}
| {{mvar|u}}
| align="center" | '''{{mvar|Nl}}'''
| Left neutral element {{mvar|n}} with respect to {{tmath|*}}
|-
| {{math|∀ ''u'':}}
| align="right" | {{tmath|u*n}}
| {{=}}
| {{mvar|u}}
| align="center" |     '''{{mvar|Nr}}'''    
| Right neutral element {{mvar|n}} with respect to {{tmath|*}}
|}

It is said that ''unification is decidable'' for a theory, if a unification algorithm has been devised for it that terminates for ''any'' input problem.
It is said that ''unification is [[Decidable problem#Decidability|semi-decidable]]'' for a theory, if a unification algorithm has been devised for it that terminates for any ''solvable'' input problem, but may keep searching forever for solutions of an unsolvable input problem.

'''Unification is decidable''' for the following theories:
* '''{{mvar|A}}'''<ref>[[Gordon D. Plotkin]], ''Lattice Theoretic Properties of Subsumption'', Memorandum MIP-R-77, Univ. Edinburgh, Jun 1970</ref>
* '''{{mvar|A}}''','''{{mvar|C}}'''<ref>[[Mark E. Stickel]], ''A Unification Algorithm for Associative-Commutative Functions'', J. Assoc. Comput. Mach., vol.28, no.3, pp. 423–434, 1981</ref>
* '''{{mvar|A}}''','''{{mvar|C}}''','''{{mvar|I}}'''<ref name="Fages.1987">F. Fages, ''Associative-Commutative Unification'', J. Symbolic Comput., vol.3, no.3, pp. 257–275, 1987</ref>
* '''{{mvar|A}}''','''{{mvar|C}}''','''{{mvar|Nl}}'''<ref group=note name="LRequivC">in the presence of equality '''{{mvar|C}}''', equalities '''{{mvar|Nl}}''' and '''{{mvar|Nr}}''' are equivalent, similar for '''{{mvar|Dl}}''' and '''{{mvar|Dr}}'''</ref><ref name="Fages.1987"/>
* '''{{mvar|A}}''','''{{mvar|I}}'''<ref>Franz Baader, ''Unification in Idempotent Semigroups is of Type Zero'', J. Automat. Reasoning, vol.2, no.3, 1986</ref>
* '''{{mvar|A}}''','''{{mvar|Nl}}'''{{mvar|,}}'''{{mvar|Nr}}''' (monoid)<ref>J. Makanin, ''The Problem of Solvability of Equations in a Free Semi-Group'', Akad. Nauk SSSR, vol.233, no.2, 1977</ref>
* '''{{mvar|C}}'''<ref>{{cite journal| author=F. Fages| title=Associative-Commutative Unification| journal=J. Symbolic Comput.| year=1987| volume=3| number=3| pages=257–275| doi=10.1016/s0747-7171(87)80004-4}}</ref>
* [[Boolean ring]]s<ref>{{cite book| author=Martin, U., Nipkow, T.| chapter=Unification in Boolean Rings| title=Proc. 8th CADE| year=1986| volume=230| pages=506–513| publisher=Springer| editor=Jörg H. Siekmann| series=LNCS}}</ref><ref>{{cite journal|author1=A. Boudet |author2=J.P. Jouannaud |author3=M. Schmidt-Schauß | title=Unification of Boolean Rings and Abelian Groups| journal=Journal of Symbolic Computation| year=1989| volume=8| pages=449–477 |url=http://www.sciencedirect.com/science/article/pii/S0747717189800549/pdf?md5=713ed362e4b6f2db53923cc5ed47c818&pid=1-s2.0-S0747717189800549-main.pdf| doi=10.1016/s0747-7171(89)80054-9}}</ref>
* [[Abelian group]]s, even if the signature is expanded by arbitrary additional symbols (but not axioms)<ref name="Baader and Snyder 2001, p. 486">Baader and Snyder (2001), p. 486.</ref>
* [[Kripke semantics#Correspondence and completeness|K4]] [[modal algebra]]s<ref>F. Baader and S. Ghilardi, ''Unification in modal and description logics'', Logic Journal of the IGPL 19 (2011), no. 6, pp. 705–730.</ref>

'''Unification is semi-decidable''' for the following theories:
* '''{{mvar|A}}''','''{{mvar|Dl}}'''{{mvar|,}}'''{{mvar|Dr}}'''<ref>P. Szabo, ''Unifikationstheorie erster Ordnung'' (''First Order Unification Theory''), Thesis, Univ. Karlsruhe, West Germany, 1982</ref>
* '''{{mvar|A}}''','''{{mvar|C}}''','''{{mvar|Dl}}'''<ref group=note name="LRequivC"/><ref>Jörg H. Siekmann, ''Universal Unification'', Proc. 7th Int. Conf. on Automated Deduction, Springer LNCS vol.170, pp. 1–42, 1984</ref>
* [[Commutative ring]]s<ref name="Baader and Snyder 2001, p. 486"/>

===One-sided paramodulation===

If there is a [[Term rewriting#Termination and convergence|convergent term rewriting system]] ''R'' available for ''E'',
the '''one-sided paramodulation''' algorithm<ref>N. Dershowitz and G. Sivakumar, ''Solving Goals in Equational Languages'', Proc. 1st Int. Workshop on Conditional Term Rewriting Systems, Springer LNCS vol.308, pp. 45–55, 1988</ref>
can be used to enumerate all solutions of given equations.

{| style="border: 1px solid darkgray;"
|+ One-sided paramodulation rules
|- border="0"
| align="right" | ''G'' ∪ { ''f''(''s''1,...,''s''''n'') ≐ ''f''(''t''1,...,''t''''n'') }
| ; ''S''
| ⇒
| align="right" | ''G'' ∪ { ''s''1 ≐ ''t''1, ..., ''s''''n'' ≐ ''t''''n'' }
| ''; ''S''
|
|     '''decompose'''
|-
| align="right" | ''G'' ∪ { ''x'' ≐ ''t'' }
| ; ''S''
| ⇒
| align="right" | ''G'' { ''x'' ↦ ''t'' }
|; ''S''{''x''↦''t''} ∪ {''x''↦''t''}
| align="right" | if the variable ''x'' doesn't occur in ''t''
|     '''eliminate'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''1,...,''s''''n'') ≐ ''t'' }
| ; ''S''
| ⇒
| align="right" | ''G'' ∪ { ''s''1 ≐ u1, ..., ''s''''n'' ≐ u''n'', ''r'' ≐ ''t'' }
| ; ''S''
| align="right" |     if ''f''(''u''1,...,''u''''n'') → ''r'' is a rule from ''R''
|     '''mutate'''
|-
| align="right" | ''G'' ∪ { ''f''(''s''1,...,''s''''n'') ≐ ''y'' }
| ; ''S''
|⇒
| align="right" | ''G'' ∪ { ''s''1 ≐ ''y''1, ..., ''s''''n'' ≐ ''y''''n'', ''y'' ≐ ''f''(''y''1,...,''y''''n'') }
| ; ''S''
| align="right" | if ''y''1,...,''y''''n'' are new variables
|     '''imitate'''
|}

Starting with ''G'' being the unification problem to be solved and ''S'' being the identity substitution, rules are applied nondeterministically until the empty set appears as the actual ''G'', in which case the actual ''S'' is a unifying substitution. Depending on the order the paramodulation rules are applied, on the choice of the actual equation from ''G'', and on the choice of ''R''’s rules in ''mutate'', different computations paths are possible. Only some lead to a solution, while others end at a ''G'' ≠ {} where no further rule is applicable (e.g. ''G'' = { ''f''(...) ≐ ''g''(...) }).

{| style="border: 1px solid darkgray;"
|+ Example term rewrite system ''R''
|- border="0"
| '''1'''
| ''app''(''nil'',''z'')
| → ''z''
|-
|'''2'''    
| ''app''(''x''.''y'',''z'')
| → ''x''.''app''(''y'',''z'')
|}

For an example, a term rewrite system ''R'' is used defining the ''append'' operator of lists built from ''cons'' and ''nil''; where ''cons''(''x'',''y'') is written in infix notation as ''x''.''y'' for brevity; e.g. ''app''(''a''.''b''.''nil'',''c''.''d''.''nil'') → ''a''.''app''(''b''.''nil'',''c''.''d''.''nil'') → ''a''.''b''.''app''(''nil'',''c''.''d''.''nil'') → ''a''.''b''.''c''.''d''.''nil'' demonstrates the concatenation of the lists ''a''.''b''.''nil'' and ''c''.''d''.''nil'', employing the rewrite rule 2,2, and 1. The equational theory ''E'' corresponding to ''R'' is the [[Closure (mathematics)#P closures of binary relations|congruence closure]] of ''R'', both viewed as binary relations on terms.
For example, ''app''(''a''.''b''.''nil'',''c''.''d''.''nil'') ≡ ''a''.''b''.''c''.''d''.''nil'' ≡ ''app''(''a''.''b''.''c''.''d''.''nil'',''nil''). The paramodulation algorithm enumerates solutions to equations with respect to that ''E'' when fed with the example ''R''.

A successful example computation path for the unification problem { ''app''(''x'',''app''(''y'',''x'')) ≐ ''a''.''a''.''nil'' } is shown below. To avoid variable name clashes, rewrite rules are consistently renamed each time before their use by rule ''mutate''; ''v''2, ''v''3, ... are computer-generated variable names for this purpose. In each line, the chosen equation from ''G'' is highlighted in red. Each time the ''mutate'' rule is applied, the chosen rewrite rule (''1'' or ''2'') is indicated in parentheses. From the last line, the unifying substitution ''S'' = { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' } can be obtained. In fact,
''app''(''x'',''app''(''y'',''x'')) {''y''↦''nil'', ''x''↦ ''a''.''nil'' } = ''app''(''a''.''nil'',''app''(''nil'',''a''.''nil'')) ≡ ''app''(''a''.''nil'',''a''.''nil'') ≡ ''a''.''app''(''nil'',''a''.''nil'') ≡ ''a''.''a''.''nil'' solves the given problem.
A second successful computation path, obtainable by choosing "mutate(1), mutate(2), mutate(2), mutate(1)" leads to the substitution ''S'' = { ''y'' ↦ ''a''.''a''.''nil'', ''x'' ↦ ''nil'' }; it is not shown here. No other path leads to a success.

{| class="wikitable"
|+ Example unifier computation
|-
! Used rule !! !! ''G'' !! ''S''
|-
| ||
| { {{color|red|''app''(''x'',''app''(''y'',''x'')) ≐ ''a''.''a''.''nil''}} }
| {}
|-
| mutate(2) || ⇒
| { ''x'' ≐ ''v''2.''v''3, ''app''(''y'',''x'') ≐ ''v''4, {{color|red|''v''2.''app''(''v''3,''v''4) ≐ ''a''.''a''.''nil''}} }
| {}
|-
| decompose || ⇒
| { {{color|red|''x'' ≐ ''v''2.''v''3}}, ''app''(''y'',''x'') ≐ ''v''4, ''v''2 ≐ ''a'', ''app(''v''3,''v''4) ≐ ''a''.''ni''l }
| {}
|-
| eliminate || ⇒
| { ''app''(''y'',''v''2.''v''3) ≐ ''v''4, {{color|red|''v''2 ≐ ''a''}}, ''app''(''v''3,''v''4) ≐ ''a''.''nil'' }
| { ''x'' ↦ ''v''2.''v''3 }
|-
| eliminate || ⇒
| { {{color|red|''app''(''y'',''a''.''v''3) ≐ ''v''4}}, ''app''(''v''3,''v''4) ≐ ''a''.''nil'' }
| { ''x'' ↦ ''a''.''v''3 }
|-
| mutate(1) || ⇒
| { ''y'' ≐ ''nil'', ''a''.''v''3 ≐ ''v''5, {{color|red|''v''5 ≐ ''v''4}}, ''app''(''v''3,''v''4) ≐ ''a''.''nil'' }
| { ''x'' ↦ ''a''.''v''3 }
|-
| eliminate || ⇒
| { {{color|red|''y'' ≐ ''nil''}}, ''a''.''v''3 ≐ ''v''4, ''app''(''v''3,''v''4) ≐ ''a''.''nil'' }
| { ''x'' ↦ ''a''.''v''3 }
|-
| eliminate || ⇒
| { ''a''.''v''3 ≐ ''v''4, {{color|red|''app''(''v''3,''v''4) ≐ ''a''.''nil''}} }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''v''3 }
|-
| mutate(1) || ⇒
| { ''a''.''v''3 ≐ ''v''4, ''v''3 ≐ ''nil'', {{color|red|''v''4 ≐ ''v''6}}, ''v''6 ≐ ''a''.''nil'' }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''v''3 }
|-
| eliminate || ⇒
| { ''a''.''v''3 ≐ ''v''4, {{color|red|''v''3 ≐ ''nil''}}, ''v''4 ≐ ''a''.''nil'' }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''v''3 }
|-
| eliminate || ⇒
| { ''a''.''nil'' ≐ ''v''4, {{color|red|''v''4 ≐ ''a''.''nil''}} }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|-
| eliminate || ⇒
| { {{color|red|''a''.''nil'' ≐ ''a''.''nil''}} }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|-
| decompose || ⇒
| { {{color|red|''a'' ≐ ''a''}}, ''nil'' ≐ ''nil'' }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|-
| decompose || ⇒
| { {{color|red|''nil'' ≐ ''nil''}} }
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|-
| decompose     || ⇒    
| {}
| { ''y'' ↦ ''nil'', ''x'' ↦ ''a''.''nil'' }
|}

===Narrowing===

[[File:Triangle diagram of narrowing step svg.svg|thumb|Triangle diagram of narrowing step ''s'' ~› ''t'' at position ''p'' in term ''s'', with unifying substitution σ (bottom row), using a rewrite rule {{math|1=''l'' → ''r''}} (top row)]]
If ''R'' is a [[Term rewriting#Termination and convergence|convergent term rewriting system]] for ''E'',
an approach alternative to the previous section consists in successive application of "'''narrowing steps'''";
this will eventually enumerate all solutions of a given equation.
A narrowing step (cf. picture) consists in
* choosing a nonvariable subterm of the current term,
* [[#Syntactic unification of first-order terms|syntactically unifying]] it with the left hand side of a rule from ''R'', and
* replacing the instantiated rule's right hand side into the instantiated term.
Formally, if {{math|''l'' → ''r''}} is a [[Term (logic)#Structural equality|renamed copy]] of a rewrite rule from ''R'', having no variables in common with a term ''s'', and the [[Term (logic)#Operations with terms|subterm]] {{math|''s''{{!}}''p''}} is not a variable and is unifiable with {{mvar|l}} via the [[#Syntactic unification of first-order terms|mgu]] {{mvar|σ}}, then {{mvar|s}} can be '''narrowed''' to the term {{math|1=''t'' = ''sσ''[''rσ'']''p''}}, i.e. to the term {{mvar|sσ}}, with the subterm at ''p'' [[Term (logic)#Operations with terms|replaced]] by {{mvar|rσ}}. The situation that ''s'' can be narrowed to ''t'' is commonly denoted as ''s'' ~› ''t''.
Intuitively, a sequence of narrowing steps ''t''1 ~› ''t''2 ~› ... ~› ''t''''n'' can be thought of as a sequence of rewrite steps ''t''1 → ''t''2 → ... → ''t''''n'', but with the initial term ''t''1 being further and further instantiated, as necessary to make each of the used rules applicable.

The [[#One-sided paramodulation|above]] example paramodulation computation corresponds to the following narrowing sequence ("↓" indicating instantiation here):

{|
|-
| ''app''( || ''x'' || ,''app''(''y'', || ''x'' || ))
|-
| || ↓ || || ↓ || || || || || || || || || || || || || || ''x'' ↦ ''v''2.''v''3
|-
| ''app''( || ''v''2.''v''3 || ,''app''(''y'', || ''v''2.''v''3 || )) || → || ''v''2.''app''(''v''3,''app''( || ''y'' || ,''v''2.''v''3))
|-
| || || || || || || || ↓ || || || || || || || || || || ''y'' ↦ ''nil''
|-
| || || || || || || ''v''2.''app''(''v''3,''app''( || ''nil'' || ,''v''2.''v''3)) || → || ''v''2.''app''( || ''v''3 || ,''v''2. || ''v''3 || )
|-
| || || || || || || || || || || || ↓ || || ↓ || || || || ''v''3 ↦ ''nil''
|-
| || || || || || || || || || || ''v''2.''app''( || ''nil'' || ,''v''2. || ''nil'' || ) || → || ''v''2.''v''2.''nil''
|}

The last term, ''v''2.''v''2.''nil'' can be syntactically unified with the original right hand side term ''a''.''a''.''nil''.

The ''narrowing lemma''<ref>{{cite book| author=Fay| chapter=First-Order Unification in an Equational Theory| title=Proc. 4th Workshop on Automated Deduction| year=1979| pages=161–167}}</ref> ensures that whenever an instance of a term ''s'' can be rewritten to a term ''t'' by a convergent term rewriting system, then ''s'' and ''t'' can be narrowed and rewritten to a term {{math|1=''s''’}} and {{math|1=''t''’}}, respectively, such that {{math|1=''t''’}} is an instance of {{math|1=''s''’}}.

Formally: whenever {{math|1=''sσ'' {{underset|&lowast;|→}} ''t''}} holds for some substitution σ, then there exist terms {{math|''s''’, ''t''’}} such that {{math|''s'' {{underset|&lowast;|~›}} ''s''’}} and {{math|''t'' {{underset|&lowast;|→}} ''t''’}} and {{math|1=''s''’''τ'' = ''t''’}} for some substitution τ.

==Higher-order unification==

Many applications require one to consider the unification of typed lambda-terms instead of first-order terms. Such unification is often called ''higher-order unification''. A well studied branch of higher-order unification is the problem of unifying simply typed lambda terms modulo the equality determined by αβη conversions. Such unification problems do not have most general unifiers. While higher-order unification is [[Undecidable problem|undecidable]],<ref>{{cite journal| author=Warren D. Goldfarb| authorlink=Warren D. Goldfarb| title=The Undecidability of the Second-Order Unification Problem| journal=TCS| year=1981| volume=13| pages=225–230| url=http://www.sciencedirect.com/science/article/pii/0304397581900402/pdf?md5=ebe7687d034498bb76c4ea9c5df56f84&pid=1-s2.0-0304397581900402-main.pdf| doi=10.1016/0304-3975(81)90040-2}}</ref><ref>{{cite journal| author=Gérard P. Huet| title=The Undecidability of Unification in Third Order Logic| journal=Information and Control| year=1973| volume=22| pages=257–267 |url=http://www.sciencedirect.com/science/article/pii/S001999587390301X/pdf?md5=0833289609c3d777bdec01d5d6ced2aa&pid=1-s2.0-S001999587390301X-main.pdf |doi=10.1016/S0019-9958(73)90301-X}}</ref><ref>Claudio Lucchesi: The Undecidability of the Unification Problem for Third Order Languages (Research Report CSRR 2059; Department of Computer Science, University of Waterloo, 1972)</ref> [[Gérard Huet]] gave a [[semi-decidable]] (pre-)unification algorithm<ref>Gérard Huet: A Unification Algorithm for typed Lambda-Calculus []</ref> that allows a systematic search of the space of unifiers (generalizing the unification algorithm of Martelli-Montanari<ref name="Martelli.Montanari.1982"/> with rules for terms containing higher-order variables) that seems to work sufficiently well in practice. Huet<ref>[http://portal.acm.org/citation.cfm?id=695200 Gérard Huet: Higher Order Unification 30 Years Later]</ref> and Gilles Dowek<ref>Gilles Dowek: Higher-Order Unification and Matching. Handbook of Automated Reasoning 2001: 1009–1062</ref> have written articles surveying this topic.

[[Dale Miller (computer scientist)|Dale Miller]] has described what is now called [[higher-order pattern unification]].<ref>{{cite journal|first1=Dale|last1=Miller|title=A Logic Programming Language with Lambda-Abstraction, Function Variables, and Simple Unification|journal=Journal of Logic and Computation|year=1991|pages=497–536|url=http://www.lix.polytechnique.fr/Labo/Dale.Miller/papers/jlc91.pdf}}</ref> This subset of higher-order unification is decidable and solvable unification problems have most-general unifiers. Many computer systems that contain higher-order unification, such as the higher-order logic programming languages [[λProlog]] and [[Twelf]], often implement only the pattern fragment and not full higher-order unification.

In computational linguistics, one of the most influential theories of [[Elliptical construction|ellipsis]] is that ellipses are represented by free variables whose values are then determined using Higher-Order Unification (HOU). For instance, the semantic representation of "Jon likes Mary and Peter does too" is <math> like(j; m) \land R(p) </math> and the value of R (the semantic representation of the ellipsis) is determined by the equation <math> like(j; m) = R(j) </math>. The process of solving such equations is called Higher-Order Unification.<ref>{{cite book| first1 = Claire | last1 = Gardent | first2 = Michael | last2 = Kohlhase | first3 = Karsten | last3 = Konrad | author2link=Michael Kohlhase| chapter=A Multi-Level, Higher-Order Unification Approach to Ellipsis| title=Submitted to European [[Association for Computational Linguistics]] (EACL)| year=1997| volume=| pages=| publisher=| editor=| series=|citeseerx = 10.1.1.55.9018}}</ref>

For example, the unification problem { ''f''(''a'', ''b'', ''a'') ≐ ''d''(''b'', ''a'', ''c'') }, where the only variable is ''f'', has the
solutions {''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''y'', ''x'', ''c'') }, {''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''y'', ''z'', ''c'') },
{''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''y'', ''a'', ''c'') }, {''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''b'', ''x'', ''c'') },
{''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''b'', ''z'', ''c'') } and {''f'' ↦ λ''x''.λ''y''.λ''z''.''d''(''b'', ''a'', ''c'') }.

[[Wayne Snyder]] gave a generalization of both higher-order unification and E-unification, i.e. an algorithm to unify lambda-terms modulo an equational theory.<ref>{{cite book | author=Wayne Snyder | contribution=Higher order E-unification | title=Proc. 10th [[Conference on Automated Deduction]] | publisher=Springer | series=LNAI | volume=449 | pages=573–587 |date=Jul 1990 }}</ref>

==See also==
*[[Rewriting]]
*[[Admissible rule]]
*[[Explicit substitution]] in [[lambda calculus]]
* Mathematical [[Equation solving]]
* [[Dis-unification (computer science)|Dis-unification]]: solving inequations between symbolic expression
* [[Anti-unification (computer science)|Anti-unification]]: computing a least general generalization (lgg) of two terms, dual to computing a most general instance (mgu)
* [[Ontology alignment]] (use ''unification'' with [[semantic equivalence]])

==Notes==
{{Reflist|group=note}}

==References==
{{reflist}}

== Further reading ==
* [[Franz Baader]] and [[Wayne Snyder]] (2001). [http://www.cs.bu.edu/~snyder/publications/UnifChapter.pdf "Unification Theory"]. In [[John Alan Robinson]] and [[Andrei Voronkov]], editors, ''[[Handbook of Automated Reasoning]]'', volume I, pages 447–533. Elsevier Science Publishers.
* [[Gilles Dowek]] (2001). [https://who.rocq.inria.fr/Gilles.Dowek/Publi/unification.ps "Higher-order Unification and Matching"]. In ''Handbook of Automated Reasoning''.
* Franz Baader and [[Tobias Nipkow]] (1998). [http://www.in.tum.de/~nipkow/TRaAT/ ''Term Rewriting and All That'']. Cambridge University Press.
* Franz Baader and [[Jörg H. Siekmann]] (1993). "Unification Theory". In ''Handbook of Logic in Artificial Intelligence and Logic Programming''.
* Jean-Pierre Jouannaud and [[Claude Kirchner]] (1991). "Solving Equations in Abstract Algebras: A Rule-Based Survey of Unification". In ''Computational Logic: Essays in Honor of Alan Robinson''.
* [[Nachum Dershowitz]] and [[Jean-Pierre Jouannaud]], ''Rewrite Systems'', in: [[Jan van Leeuwen]] (ed.), ''[[Handbook of Theoretical Computer Science]]'', volume B ''Formal Models and Semantics'', Elsevier, 1990, pp. 243–320
* Jörg H. Siekmann (1990). "Unification Theory". In [[Claude Kirchner]] (editor) ''Unification''. Academic Press.
* {{cite journal| author=Kevin Knight| title=Unification: A Multidisciplinary Survey| journal=ACM Computing Surveys|date=Mar 1989| volume=21| number=1| pages=93–124| url=http://www.isi.edu/natural-language/people/unification-knight.pdf| doi=10.1145/62029.62030}}
* [[Gérard Huet]] and [[Derek C. Oppen]] (1980). [http://infolab.stanford.edu/pub/cstr/reports/cs/tr/80/785/CS-TR-80-785.pdf "Equations and Rewrite Rules: A Survey"]. Technical report. Stanford University.
* {{cite journal | last1 = Raulefs | first1 = Peter | last2 = Siekmann | first2 = Jörg | last3 = Szabó | first3 = P. | last4 = Unvericht | first4 = E. | year = 1979 | title = A short survey on the state of the art in matching and unification problems | url = | journal = ACM SIGSAM Bulletin | volume = 13 | issue = 2 }}
* Claude Kirchner and Hélène Kirchner. ''Rewriting, Solving, Proving''. In preparation.

[[Category:Automated theorem proving]]
[[Category:Logic programming]]
[[Category:Rewriting systems]]
[[Category:Logic in computer science]]
[[Category:Type theory]]
[[Category:Unification (computer science)| ]]

Prime ideal

2017-02-02T21:49:05Z

Magmalex: /* Prime ideals for noncommutative rings */ added inline reference

{{about|ideals in [[ring theory]]|prime ideals in order theory|ideal (order theory)#Prime ideals}}
[[File:A portion of the lattice of ideals of Z illustrating prime, semiprime and primary ideals SVG.svg|A portion of the lattice of ideals of Z illustrating prime, semiprime and primary ideals|thumb|right|320px|A [[Hasse diagram]] of a portion of the lattice of ideals of the integers {{math|'''Z'''}}. The purple nodes indicate prime ideals. The purple and green nodes are [[semiprime ideal]]s, and the purple and blue nodes are [[primary ideal]]s.]] In [[algebra]], a '''prime ideal''' is a [[subset]] of a [[ring (mathematics)|ring]] that shares many important properties of a [[prime number]] in the [[ring of integers]].<ref>{{cite book | last1=Dummit | first1=David S. | last2=Foote | first2=Richard M. | title=Abstract Algebra | publisher=[[John Wiley & Sons]] | year=2004 | edition=3rd | isbn=0-471-43334-9}}</ref><ref>{{cite book | last=Lang | first=Serge | authorlink=Serge Lang | title=Algebra | publisher=[[Springer Science+Business Media|Springer]] | series=[[Graduate Texts in Mathematics]] | year=2002 | isbn=0-387-95385-X}}</ref> The prime ideals for the integers are the sets that contain all the multiples of a given prime number, together with the [[zero ideal]].

[[Primitive ideal]]s are prime, and prime ideals are both [[primary ideal|primary]] and [[semiprime ideal|semiprime]].

==Prime ideals for commutative rings==
An [[ideal (ring theory)|ideal]] {{mvar|P}} of a [[commutative ring]] {{mvar|R}} is '''prime''' if it has the following two properties:

* If {{mvar|a}} and {{mvar|b}} are two elements of {{mvar|R}} such that their product {{math|''ab''}} is an element of {{mvar|P}}, then {{math|''a''}} is in {{mvar|P}} or {{math|''b''}} is in {{mvar|P}},
* {{mvar|P}} is not equal to {{mvar|R}} for the whole ring.

This generalizes the following property of prime numbers: if {{math|''p''}} is a prime number and if {{math|''p''}} divides a product {{math|''ab''}} of two [[integer]]s, then {{math|''p''}} divides {{math|''a''}} or {{math|''p''}} divides {{math|''b''}}. We can therefore say

:A positive integer {{mvar|n}} is a prime number if and only if {{math|''n'''''Z'''}} is a prime ideal in {{math|'''Z'''}}.

===Examples===
* A simple example: For {{mvar|R}}={{math|'''Z'''}}, the set of even numbers is a prime ideal.
* If {{mvar|R}} denotes the ring {{math|'''C'''[''X'', ''Y'']}} of [[polynomial]]s in two variables with [[complex number|complex]] coefficients, then the ideal generated by the polynomial {{math|''Y'' 2 − ''X'' 3 − ''X'' − 1}} is a prime ideal (see [[elliptic curve]]).
* In the ring {{math|'''Z'''[''X'']}} of all polynomials with integer coefficients, the ideal generated by {{math|2}} and {{mvar|X}} is a prime ideal. It consists of all those polynomials whose constant coefficient is even.
* In any ring {{mvar|R}}, a '''[[maximal ideal]]''' is an ideal {{mvar|M}} that is [[maximal element|maximal]] in the set of all proper ideals of {{mvar|R}}, i.e. {{mvar|M}} is [[subset|contained in]] exactly two ideals of {{mvar|R}}, namely {{mvar|M}} itself and the entire ring {{mvar|R}}. Every maximal ideal is in fact prime. In a [[principal ideal domain]] every nonzero prime ideal is maximal, but this is not true in general.
* If {{mvar|M}} is a smooth [[manifold]], {{mvar|R}} is the ring of smooth real functions on {{mvar|M}}, and {{mvar|x}} is a point in {{mvar|M}}, then the set of all smooth functions {{mvar|f}} with {{math|''f'' (''x'') {{=}} 0}} forms a prime ideal (even a maximal ideal) in {{mvar|R}}.

===Properties===
* An ideal {{math|''I''}} in the ring {{mvar|R}} (with unity) is prime if and only if the factor ring {{math|''R''/''I''}} is an [[integral domain]]. In particular, a commutative ring is an integral domain if and only if {{math|(0) }} is a prime ideal.
* An ideal {{math|''I''}} is prime if and only if its set-theoretic complement is [[multiplicatively closed set|multiplicatively closed]].<ref>{{cite book | last=Reid | first=Miles | authorlink=Miles Reid | title=Undergraduate Commutative Algebra | publisher=[[Cambridge University Press]] | year=1996 | isbn=0-521-45889-7}}</ref>
* Every nonzero ring contains at least one prime ideal (in fact it contains at least one maximal ideal), which is a direct consequence of [[Krull's theorem]].
* The set of all prime ideals (the spectrum of a ring) contains minimal elements (called [[minimal prime (commutative algebra)|minimal prime]]). Geometrically, these correspond to irreducible components of the spectrum.
* The [[preimage]] of a prime ideal under a ring homomorphism is a prime ideal.
* The sum of two prime ideals is not necessarily prime. For an example, consider the ring {{math|'''C'''[''x'', ''y'']}} with prime ideals {{math|''P'' {{=}} (''x''2 + ''y''2 − 1)}} and {{math|''Q'' {{=}} (''x'')}} (the ideals generated by {{math|''x''2 + ''y''2 − 1}} and ''x'' respectively). Their sum {{math|''P'' + ''Q'' {{=}} (''x''2 + ''y''2 − 1, ''x'') {{=}} (''y''2 − 1, ''x'')}} however is not prime: {{math|''y''2 − 1 {{=}} (''y'' − 1)(''y'' + 1) ∈ ''P'' + ''Q''}} but its two factors are not. Alternatively, note that the quotient ring has zero divisors so it is not an integral domain and thus {{math|''P'' + ''Q''}} cannot be prime.
* In a commutative ring {{mvar|R}} with at least two elements, if every proper ideal is prime, then the ring is a field. (If the ideal {{math|(0)}} is prime, then the ring {{mvar|R}} is an integral domain. If {{mvar|q}} is any non-zero element of {{mvar|R}} and the ideal {{math|(''q''2)}} is prime, then it contains {{mvar|q}} and then {{mvar|q}} is invertible.)
* A nonzero principal ideal is prime if and only if it is generated by a [[prime element]]. In a UFD, every nonzero prime ideal contains a prime element.

===Uses===
One use of prime ideals occurs in [[algebraic geometry]], where varieties are defined as the zero sets of ideals in polynomial rings. It turns out that the irreducible varieties correspond to prime ideals. In the modern abstract approach, one starts with an arbitrary commutative ring and turns the set of its prime ideals, also called its [[spectrum of a ring|spectrum]], into a [[topological space]] and can thus define generalizations of varieties called [[scheme (mathematics)|schemes]], which find applications not only in [[geometry]], but also in [[number theory]].

The introduction of prime ideals in [[algebraic number theory]] was a major step forward: it was realized that the important property of unique factorisation expressed in the [[fundamental theorem of arithmetic]] does not hold in every ring of [[algebraic integer]]s, but a substitute was found when [[Richard Dedekind]] replaced elements by ideals and prime elements by prime ideals; see [[Dedekind domain]].

==Prime ideals for noncommutative rings==
The notion of a prime ideal can be generalized to noncommutative rings by using the commutative definition "ideal-wise". [[Wolfgang Krull]] advanced this idea in 1928.<ref>Krull, Wolfgang, ''Primidealketten in allgemeinen Ringbereichen'', Sitzungsberichte Heidelberg. Akad. Wissenschaft (1928), 7. Abhandl.,3-14.</ref> The following content can be found in texts such as Goodearl's <ref>Goodearl, ''An Introduction to Noncommutative Noetherian Rings''</ref> and Lam's <ref>Lam, ''First Course in Noncommutative Rings''</ref>. If {{mvar|R}} is a (possibly noncommutative) ring and {{mvar|P}} is an ideal in {{mvar|R}} other than {{mvar|R}} itself, we say that {{mvar|P}} is '''prime''' if for any two ideals {{mvar|A}} and {{mvar|B}} of {{mvar|R}}:

* If the product of ideals {{math|''AB''}} is contained in {{mvar|P}}, then at least one of {{mvar|A}} and {{mvar|B}} is contained in {{mvar|P}}.

It can be shown that this definition is equivalent to the commutative one in commutative rings. It is readily verified that if an ideal of a noncommutative ring {{mvar|R}} satisfies the commutative definition of prime, then it also satisfies the noncommutative version. An ideal {{mvar|P}} satisfying the commutative definition of prime is sometimes called a '''completely prime ideal''' to distinguish it from other merely prime ideals in the ring. Completely prime ideals are prime ideals, but the converse is not true. For example, the zero ideal in the ring of {{math|''n'' × ''n''}} matrices over a field is a prime ideal, but it is not completely prime.

This is close to the historical point of view of ideals as [[ideal number]]s, as for the ring {{math|'''Z'''}} "{{mvar|A}} is contained in {{mvar|P}}" is another way of saying "{{mvar|P}} divides {{mvar|A}}", and the unit ideal {{mvar|R}} represents unity.

Equivalent formulations of the ideal {{math|''P'' ≠ ''R''}} being prime include the following properties:
* For all {{mvar|a}} and {{mvar|b}} in {{mvar|R}}, {{math|(''a'')(''b'') ⊆ ''P''}} implies {{math|''a'' ∈ ''P''}} or {{math|''b'' ∈ ''P''}}.
* For any two ''right'' ideals of {{mvar|R}}, {{math|''AB'' ⊆ ''P''}} implies {{math|''A'' ⊆ ''P''}} or {{math|''B'' ⊆ ''P''}}.
* For any two ''left'' ideals of {{mvar|R}}, {{math|''AB'' ⊆ ''P''}} implies {{math|''A'' ⊆ ''P''}} or {{math|''B'' ⊆ ''P''}}.
* For any elements {{mvar|a}} and {{mvar|b}} of {{mvar|R}}, if {{math|''aRb'' ⊆ ''P''}}, then {{math|''a'' ∈ ''P''}} or {{math|''b'' ∈ ''P''}}.

Prime ideals in commutative rings are characterized by having [[multiplicatively closed subset|multiplicatively closed]] complements in {{mvar|R}}, and with slight modification, a similar characterization can be formulated for prime ideals in noncommutative rings. A nonempty subset {{math|''S'' ⊆ ''R''}} is called an '''m-system''' if for any {{mvar|a}} and {{mvar|b}} in {{mvar|S}}, there exists {{mvar|r}} in {{mvar|R}} such that ''arb'' is in {{mvar|S}}.<ref>Obviously, multiplicatively closed sets are m-systems.</ref> The following item can then be added to the list of equivalent conditions above:

* The complement {{math|''R''\''P''}} is an m-system.

===Examples===
* Any [[primitive ideal]] is prime.
* As with commutative rings, maximal ideals are prime, and also prime ideals contain minimal prime ideals.
* A ring is a [[prime ring]] if and only if the zero ideal is a prime ideal, and moreover a ring is a [[integral domain|domain]] if and only if the zero ideal is a completely prime ideal.
* Another fact from commutative theory echoed in noncommutative theory is that if {{mvar|A}} is a nonzero {{mvar|R}} module, and {{mvar|P}} is a maximal element in the [[poset]] of [[Annihilator (ring theory)|annihilator]] ideals of submodules of {{mvar|A}}, then {{mvar|P}} is prime.

==Important facts==
*'''[[Prime avoidance lemma]].''' If {{mvar|R}} is a commutative ring, and {{mvar|A}} is a subring (possibly without unity), and {{math|''I''1, ..., ''In''}} is a collection of ideals of {{mvar|R}} with at most two members not prime, then if {{mvar|A}} is not contained in any {{math|''Ij''}}, it is also not contained in the [[union (set theory)|union]] of {{math|''I''1, ..., ''In''}}.<ref>Jacobson ''Basic Algebra II'', p. 390</ref> In particular, {{mvar|A}} could be an ideal of {{mvar|R}}.
* If {{mvar|S}} is any m-system in {{mvar|R}}, then a lemma essentially due to Krull shows that there exists an ideal of {{mvar|R}} maximal with respect to being disjoint from {{mvar|S}}, and moreover the ideal must be prime.<ref>Lam ''First Course in Noncommutative Rings'', p. 156</ref> In the case {{math|{''S''} {{=}} {1},}} we have [[Krull's theorem]], and this recovers the maximal ideals of {{mvar|R}}. Another prototypical m-system is the set, {{math|{''x'', ''x''2, ''x''3, ''x''4, ...},}} of all positive powers of a non-[[nilpotent]] element.
* For a prime ideal {{mvar|P}}, the complement {{math|''R''\''P''}} has another property beyond being an m-system. If ''xy'' is in {{math|''R''\''P''}}, then both {{mvar|x}} and {{mvar|y}} must be in {{math|''R''\''P''}}, since {{mvar|P}} is an ideal. A set that contains the divisors of its elements is called '''saturated'''.
* For a commutative ring {{mvar|R}}, there is a kind of converse for the previous statement: If {{mvar|S}} is any nonempty saturated and multiplicatively closed subset of {{mvar|R}}, the complement {{math|''R''\''S''}} is a union of prime ideals of {{mvar|R}}.<ref>Kaplansky ''Commutative rings'', p. 2</ref>
*The intersection of members of a descending chain of prime ideals is a prime ideal, and in a commutative ring the union of members of an ascending chain of prime ideals is a prime ideal. With [[Zorn's Lemma]], these observations imply that the poset of prime ideals of a commutative ring (partially ordered by inclusion) has maximal and minimal elements.

==Connection to maximality==
Prime ideals can frequently be produced as maximal elements of certain collections of ideals. For example:
* An ideal maximal with respect to having empty intersection with a fixed m-system is prime.
* An ideal maximal among [[Annihilator (ring theory)|annihilators]] of submodules of a fixed {{mvar|R}} module {{mvar|M}} is prime.
* In a commutative ring, an ideal maximal with respect to being non-principal is prime.<ref>Kaplansky ''Commutative rings'', p. 10, Ex 10.</ref>
* In a commutative ring, an ideal maximal with respect to being not countably generated is prime.<ref>Kaplansky ''Commutative rings'', p. 10, Ex 11.</ref>

==References==
{{Reflist}}

==Further reading==

*{{citation|author1=Goodearl, K. R.|author2=Warfield, R. B., Jr.|title=An introduction to noncommutative Noetherian rings| series=London Mathematical Society Student Texts |volume=61 |edition=2 |publisher=Cambridge University Press | place=Cambridge |year=2004 |pages=xxiv+344|isbn=0-521-54537-4 |mr=2080008 }}
*{{citation|author=Jacobson, Nathan|title=Basic algebra. II|edition=2|publisher=W. H. Freeman and Company |place=New York|year=1989 |pages=xviii+686 |isbn=0-7167-1933-9 |mr=1009787}}
*{{citation|author=Kaplansky, Irving |title=Commutative rings |publisher=Allyn and Bacon Inc. |place=Boston, Mass.|year=1970 |pages=x+180 |mr=0254021 }}
*{{citation|author=Lam, T. Y.|authorlink=Tsit Yuen Lam |title=A first course in noncommutative rings |series=Graduate Texts in Mathematics|volume=131 |edition=2nd |publisher=Springer-Verlag |place=New York |year=2001 |pages=xx+385 |isbn=0-387-95183-0 |mr=1838439 | zbl=0980.16001 }}
*{{citation|author1=Lam, T. Y. | author1-link=Tsit Yuen Lam |author2=Reyes, Manuel L. |title=A prime ideal principle in commutative algebra |journal=J. Algebra |volume=319 |year=2008 |number=7 |pages=3006–3027 |issn=0021-8693 |mr=2397420 | zbl=1168.13002 |doi=10.1016/j.jalgebra.2007.07.016}}
* {{springer|title=Prime ideal|id=p/p074510}}

{{DEFAULTSORT:Prime Ideal}}
[[Category:Prime ideals| ]]

Prime ideal

2017-02-02T19:05:36Z

Magmalex: /* Prime ideals for noncommutative rings */ added inline reference

{{about|ideals in [[ring theory]]|prime ideals in order theory|ideal (order theory)#Prime ideals}}
[[File:A portion of the lattice of ideals of Z illustrating prime, semiprime and primary ideals SVG.svg|A portion of the lattice of ideals of Z illustrating prime, semiprime and primary ideals|thumb|right|320px|A [[Hasse diagram]] of a portion of the lattice of ideals of the integers {{math|'''Z'''}}. The purple nodes indicate prime ideals. The purple and green nodes are [[semiprime ideal]]s, and the purple and blue nodes are [[primary ideal]]s.]] In [[algebra]], a '''prime ideal''' is a [[subset]] of a [[ring (mathematics)|ring]] that shares many important properties of a [[prime number]] in the [[ring of integers]].<ref>{{cite book | last1=Dummit | first1=David S. | last2=Foote | first2=Richard M. | title=Abstract Algebra | publisher=[[John Wiley & Sons]] | year=2004 | edition=3rd | isbn=0-471-43334-9}}</ref><ref>{{cite book | last=Lang | first=Serge | authorlink=Serge Lang | title=Algebra | publisher=[[Springer Science+Business Media|Springer]] | series=[[Graduate Texts in Mathematics]] | year=2002 | isbn=0-387-95385-X}}</ref> The prime ideals for the integers are the sets that contain all the multiples of a given prime number, together with the [[zero ideal]].

[[Primitive ideal]]s are prime, and prime ideals are both [[primary ideal|primary]] and [[semiprime ideal|semiprime]].

==Prime ideals for commutative rings==
An [[ideal (ring theory)|ideal]] {{mvar|P}} of a [[commutative ring]] {{mvar|R}} is '''prime''' if it has the following two properties:

* If {{mvar|a}} and {{mvar|b}} are two elements of {{mvar|R}} such that their product {{math|''ab''}} is an element of {{mvar|P}}, then {{math|''a''}} is in {{mvar|P}} or {{math|''b''}} is in {{mvar|P}},
* {{mvar|P}} is not equal to {{mvar|R}} for the whole ring.

This generalizes the following property of prime numbers: if {{math|''p''}} is a prime number and if {{math|''p''}} divides a product {{math|''ab''}} of two [[integer]]s, then {{math|''p''}} divides {{math|''a''}} or {{math|''p''}} divides {{math|''b''}}. We can therefore say

:A positive integer {{mvar|n}} is a prime number if and only if {{math|''n'''''Z'''}} is a prime ideal in {{math|'''Z'''}}.

===Examples===
* A simple example: For {{mvar|R}}={{math|'''Z'''}}, the set of even numbers is a prime ideal.
* If {{mvar|R}} denotes the ring {{math|'''C'''[''X'', ''Y'']}} of [[polynomial]]s in two variables with [[complex number|complex]] coefficients, then the ideal generated by the polynomial {{math|''Y'' 2 − ''X'' 3 − ''X'' − 1}} is a prime ideal (see [[elliptic curve]]).
* In the ring {{math|'''Z'''[''X'']}} of all polynomials with integer coefficients, the ideal generated by {{math|2}} and {{mvar|X}} is a prime ideal. It consists of all those polynomials whose constant coefficient is even.
* In any ring {{mvar|R}}, a '''[[maximal ideal]]''' is an ideal {{mvar|M}} that is [[maximal element|maximal]] in the set of all proper ideals of {{mvar|R}}, i.e. {{mvar|M}} is [[subset|contained in]] exactly two ideals of {{mvar|R}}, namely {{mvar|M}} itself and the entire ring {{mvar|R}}. Every maximal ideal is in fact prime. In a [[principal ideal domain]] every nonzero prime ideal is maximal, but this is not true in general.
* If {{mvar|M}} is a smooth [[manifold]], {{mvar|R}} is the ring of smooth real functions on {{mvar|M}}, and {{mvar|x}} is a point in {{mvar|M}}, then the set of all smooth functions {{mvar|f}} with {{math|''f'' (''x'') {{=}} 0}} forms a prime ideal (even a maximal ideal) in {{mvar|R}}.

===Properties===
* An ideal {{math|''I''}} in the ring {{mvar|R}} (with unity) is prime if and only if the factor ring {{math|''R''/''I''}} is an [[integral domain]]. In particular, a commutative ring is an integral domain if and only if {{math|(0) }} is a prime ideal.
* An ideal {{math|''I''}} is prime if and only if its set-theoretic complement is [[multiplicatively closed set|multiplicatively closed]].<ref>{{cite book | last=Reid | first=Miles | authorlink=Miles Reid | title=Undergraduate Commutative Algebra | publisher=[[Cambridge University Press]] | year=1996 | isbn=0-521-45889-7}}</ref>
* Every nonzero ring contains at least one prime ideal (in fact it contains at least one maximal ideal), which is a direct consequence of [[Krull's theorem]].
* The set of all prime ideals (the spectrum of a ring) contains minimal elements (called [[minimal prime (commutative algebra)|minimal prime]]). Geometrically, these correspond to irreducible components of the spectrum.
* The [[preimage]] of a prime ideal under a ring homomorphism is a prime ideal.
* The sum of two prime ideals is not necessarily prime. For an example, consider the ring {{math|'''C'''[''x'', ''y'']}} with prime ideals {{math|''P'' {{=}} (''x''2 + ''y''2 − 1)}} and {{math|''Q'' {{=}} (''x'')}} (the ideals generated by {{math|''x''2 + ''y''2 − 1}} and ''x'' respectively). Their sum {{math|''P'' + ''Q'' {{=}} (''x''2 + ''y''2 − 1, ''x'') {{=}} (''y''2 − 1, ''x'')}} however is not prime: {{math|''y''2 − 1 {{=}} (''y'' − 1)(''y'' + 1) ∈ ''P'' + ''Q''}} but its two factors are not. Alternatively, note that the quotient ring has zero divisors so it is not an integral domain and thus {{math|''P'' + ''Q''}} cannot be prime.
* In a commutative ring {{mvar|R}} with at least two elements, if every proper ideal is prime, then the ring is a field. (If the ideal {{math|(0)}} is prime, then the ring {{mvar|R}} is an integral domain. If {{mvar|q}} is any non-zero element of {{mvar|R}} and the ideal {{math|(''q''2)}} is prime, then it contains {{mvar|q}} and then {{mvar|q}} is invertible.)
* A nonzero principal ideal is prime if and only if it is generated by a [[prime element]]. In a UFD, every nonzero prime ideal contains a prime element.

===Uses===
One use of prime ideals occurs in [[algebraic geometry]], where varieties are defined as the zero sets of ideals in polynomial rings. It turns out that the irreducible varieties correspond to prime ideals. In the modern abstract approach, one starts with an arbitrary commutative ring and turns the set of its prime ideals, also called its [[spectrum of a ring|spectrum]], into a [[topological space]] and can thus define generalizations of varieties called [[scheme (mathematics)|schemes]], which find applications not only in [[geometry]], but also in [[number theory]].

The introduction of prime ideals in [[algebraic number theory]] was a major step forward: it was realized that the important property of unique factorisation expressed in the [[fundamental theorem of arithmetic]] does not hold in every ring of [[algebraic integer]]s, but a substitute was found when [[Richard Dedekind]] replaced elements by ideals and prime elements by prime ideals; see [[Dedekind domain]].

==Prime ideals for noncommutative rings==
The notion of a prime ideal can be generalized to noncommutative rings by using the commutative definition "ideal-wise". [[Wolfgang Krull]] advanced this idea in 1928.<ref>Krull, Wolfgang, ''Primidealketten in allgemeinen Ringbereichen'', Sitzungsberichte Heidelberg. Akad. Wissenschaft (1928), 7. Abhandl.,3-14.</ref> The following content can be found in texts such as {{harv|Goodearl|2004}} and Lam's <ref>Lam ''First Course in Noncommutative Rings''</ref>. If {{mvar|R}} is a (possibly noncommutative) ring and {{mvar|P}} is an ideal in {{mvar|R}} other than {{mvar|R}} itself, we say that {{mvar|P}} is '''prime''' if for any two ideals {{mvar|A}} and {{mvar|B}} of {{mvar|R}}:

* If the product of ideals {{math|''AB''}} is contained in {{mvar|P}}, then at least one of {{mvar|A}} and {{mvar|B}} is contained in {{mvar|P}}.

It can be shown that this definition is equivalent to the commutative one in commutative rings. It is readily verified that if an ideal of a noncommutative ring {{mvar|R}} satisfies the commutative definition of prime, then it also satisfies the noncommutative version. An ideal {{mvar|P}} satisfying the commutative definition of prime is sometimes called a '''completely prime ideal''' to distinguish it from other merely prime ideals in the ring. Completely prime ideals are prime ideals, but the converse is not true. For example, the zero ideal in the ring of {{math|''n'' × ''n''}} matrices over a field is a prime ideal, but it is not completely prime.

This is close to the historical point of view of ideals as [[ideal number]]s, as for the ring {{math|'''Z'''}} "{{mvar|A}} is contained in {{mvar|P}}" is another way of saying "{{mvar|P}} divides {{mvar|A}}", and the unit ideal {{mvar|R}} represents unity.

Equivalent formulations of the ideal {{math|''P'' ≠ ''R''}} being prime include the following properties:
* For all {{mvar|a}} and {{mvar|b}} in {{mvar|R}}, {{math|(''a'')(''b'') ⊆ ''P''}} implies {{math|''a'' ∈ ''P''}} or {{math|''b'' ∈ ''P''}}.
* For any two ''right'' ideals of {{mvar|R}}, {{math|''AB'' ⊆ ''P''}} implies {{math|''A'' ⊆ ''P''}} or {{math|''B'' ⊆ ''P''}}.
* For any two ''left'' ideals of {{mvar|R}}, {{math|''AB'' ⊆ ''P''}} implies {{math|''A'' ⊆ ''P''}} or {{math|''B'' ⊆ ''P''}}.
* For any elements {{mvar|a}} and {{mvar|b}} of {{mvar|R}}, if {{math|''aRb'' ⊆ ''P''}}, then {{math|''a'' ∈ ''P''}} or {{math|''b'' ∈ ''P''}}.

Prime ideals in commutative rings are characterized by having [[multiplicatively closed subset|multiplicatively closed]] complements in {{mvar|R}}, and with slight modification, a similar characterization can be formulated for prime ideals in noncommutative rings. A nonempty subset {{math|''S'' ⊆ ''R''}} is called an '''m-system''' if for any {{mvar|a}} and {{mvar|b}} in {{mvar|S}}, there exists {{mvar|r}} in {{mvar|R}} such that ''arb'' is in {{mvar|S}}.<ref>Obviously, multiplicatively closed sets are m-systems.</ref> The following item can then be added to the list of equivalent conditions above:

* The complement {{math|''R''\''P''}} is an m-system.

===Examples===
* Any [[primitive ideal]] is prime.
* As with commutative rings, maximal ideals are prime, and also prime ideals contain minimal prime ideals.
* A ring is a [[prime ring]] if and only if the zero ideal is a prime ideal, and moreover a ring is a [[integral domain|domain]] if and only if the zero ideal is a completely prime ideal.
* Another fact from commutative theory echoed in noncommutative theory is that if {{mvar|A}} is a nonzero {{mvar|R}} module, and {{mvar|P}} is a maximal element in the [[poset]] of [[Annihilator (ring theory)|annihilator]] ideals of submodules of {{mvar|A}}, then {{mvar|P}} is prime.

==Important facts==
*'''[[Prime avoidance lemma]].''' If {{mvar|R}} is a commutative ring, and {{mvar|A}} is a subring (possibly without unity), and {{math|''I''1, ..., ''In''}} is a collection of ideals of {{mvar|R}} with at most two members not prime, then if {{mvar|A}} is not contained in any {{math|''Ij''}}, it is also not contained in the [[union (set theory)|union]] of {{math|''I''1, ..., ''In''}}.<ref>Jacobson ''Basic Algebra II'', p. 390</ref> In particular, {{mvar|A}} could be an ideal of {{mvar|R}}.
* If {{mvar|S}} is any m-system in {{mvar|R}}, then a lemma essentially due to Krull shows that there exists an ideal of {{mvar|R}} maximal with respect to being disjoint from {{mvar|S}}, and moreover the ideal must be prime.<ref>Lam ''First Course in Noncommutative Rings'', p. 156</ref> In the case {{math|{''S''} {{=}} {1},}} we have [[Krull's theorem]], and this recovers the maximal ideals of {{mvar|R}}. Another prototypical m-system is the set, {{math|{''x'', ''x''2, ''x''3, ''x''4, ...},}} of all positive powers of a non-[[nilpotent]] element.
* For a prime ideal {{mvar|P}}, the complement {{math|''R''\''P''}} has another property beyond being an m-system. If ''xy'' is in {{math|''R''\''P''}}, then both {{mvar|x}} and {{mvar|y}} must be in {{math|''R''\''P''}}, since {{mvar|P}} is an ideal. A set that contains the divisors of its elements is called '''saturated'''.
* For a commutative ring {{mvar|R}}, there is a kind of converse for the previous statement: If {{mvar|S}} is any nonempty saturated and multiplicatively closed subset of {{mvar|R}}, the complement {{math|''R''\''S''}} is a union of prime ideals of {{mvar|R}}.<ref>Kaplansky ''Commutative rings'', p. 2</ref>
*The intersection of members of a descending chain of prime ideals is a prime ideal, and in a commutative ring the union of members of an ascending chain of prime ideals is a prime ideal. With [[Zorn's Lemma]], these observations imply that the poset of prime ideals of a commutative ring (partially ordered by inclusion) has maximal and minimal elements.

==Connection to maximality==
Prime ideals can frequently be produced as maximal elements of certain collections of ideals. For example:
* An ideal maximal with respect to having empty intersection with a fixed m-system is prime.
* An ideal maximal among [[Annihilator (ring theory)|annihilators]] of submodules of a fixed {{mvar|R}} module {{mvar|M}} is prime.
* In a commutative ring, an ideal maximal with respect to being non-principal is prime.<ref>Kaplansky ''Commutative rings'', p. 10, Ex 10.</ref>
* In a commutative ring, an ideal maximal with respect to being not countably generated is prime.<ref>Kaplansky ''Commutative rings'', p. 10, Ex 11.</ref>

==References==
{{Reflist}}

==Further reading==

*{{citation|author1=Goodearl, K. R.|author2=Warfield, R. B., Jr.|title=An introduction to noncommutative Noetherian rings| series=London Mathematical Society Student Texts |volume=61 |edition=2 |publisher=Cambridge University Press | place=Cambridge |year=2004 |pages=xxiv+344|isbn=0-521-54537-4 |mr=2080008 }}
*{{citation|author=Jacobson, Nathan|title=Basic algebra. II|edition=2|publisher=W. H. Freeman and Company |place=New York|year=1989 |pages=xviii+686 |isbn=0-7167-1933-9 |mr=1009787}}
*{{citation|author=Kaplansky, Irving |title=Commutative rings |publisher=Allyn and Bacon Inc. |place=Boston, Mass.|year=1970 |pages=x+180 |mr=0254021 }}
*{{citation|author=Lam, T. Y.|authorlink=Tsit Yuen Lam |title=A first course in noncommutative rings |series=Graduate Texts in Mathematics|volume=131 |edition=2nd |publisher=Springer-Verlag |place=New York |year=2001 |pages=xx+385 |isbn=0-387-95183-0 |mr=1838439 | zbl=0980.16001 }}
*{{citation|author1=Lam, T. Y. | author1-link=Tsit Yuen Lam |author2=Reyes, Manuel L. |title=A prime ideal principle in commutative algebra |journal=J. Algebra |volume=319 |year=2008 |number=7 |pages=3006–3027 |issn=0021-8693 |mr=2397420 | zbl=1168.13002 |doi=10.1016/j.jalgebra.2007.07.016}}
* {{springer|title=Prime ideal|id=p/p074510}}

{{DEFAULTSORT:Prime Ideal}}
[[Category:Prime ideals| ]]

Adjoint functors

2016-01-21T13:44:16Z

Magmalex: Added an external link

{{About||the construction in field theory|Adjunction (field theory)|the construction in topology|Adjunction space}}

In [[mathematics]], specifically [[category theory]], '''adjunction''' is a possible relationship between two [[functor]]s.

Adjunction is ubiquitous in mathematics, as it specifies intuitive notions of optimization and efficiency.

In the most concise symmetric definition, an adjunction between categories ''C'' and ''D'' is a pair of functors,
:<math>F: \mathcal{D} \rightarrow \mathcal{C}</math>   and   <math>G: \mathcal{C} \rightarrow \mathcal{D}</math>
and a family of [[bijection]]s
:<math>\mathrm{hom}_{\mathcal{C}}(FY,X) \cong \mathrm{hom}_{\mathcal{D}}(Y,GX)</math>
which is [[natural transformation|natural]] in the variables ''X'' and ''Y''. The functor ''F'' is called a '''left adjoint functor''', while ''G'' is called a '''right adjoint functor'''. The relationship “''F'' is left adjoint to ''G''” (or equivalently, “''G'' is right adjoint to ''F''”) is sometimes written
:<math>F\dashv G.</math>

This definition and others are made precise below.

== Introduction ==

“The slogan is ‘Adjoint functors arise everywhere’.” (Saunders Mac Lane, ''Categories for the working mathematician'')

The [[Adjoint functors#Examples|long list of examples]] in this article is only a partial indication of how often an interesting mathematical construction is an adjoint functor. As a result, general theorems about left/right adjoint functors, such as the equivalence of their various definitions or the fact that they respectively preserve [[Limit (category theory)|colimits/limits]] (which are also found in every area of mathematics), can encode the details of many useful and otherwise non-trivial results.

=== Spelling (or [[morphology (linguistics)|morphology]]) ===

One can observe (e.g. in this article), two different [[root (linguistics)|roots]] are used: "adjunct" and "adjoint". From Oxford shorter English dictionary, "adjunct" is from Latin, "adjoint" is from French.

In Mac Lane, ''Categories for the working mathematician,'' chap. 4, "Adjoints", one can verify the following usage.

<math>\varphi: \mathrm{hom}_{\mathcal{C}}(FY,X) \cong \mathrm{hom}_{\mathcal{D}}(Y,GX)</math>

The hom-set bijection <math>\varphi</math> is an "adjunction".

If <math>f</math> an arrow in <math> \mathrm{hom}_{\mathcal{C}}(FY,X) </math>, <math>\varphi f</math> is the right "adjunct" of <math>f</math> (p. 81).

The functor <math> F </math> is left "adjoint" for <math>G</math>.

== Motivation ==

===Solutions to optimization problems===
It can be said that an adjoint functor is a way of giving the ''most efficient'' solution to some problem via a method which is ''formulaic''. For example, an elementary problem in [[ring theory]] is how to turn a [[Rng (algebra)|rng]] (which is like a ring that might not have a multiplicative identity) into a [[ring (mathematics)|ring]]. The ''most efficient'' way is to adjoin an element '1' to the rng, adjoin all (and only) the elements which are necessary for satisfying the ring axioms (e.g. ''r''+1 for each ''r'' in the ring), and impose no relations in the newly formed ring that are not forced by axioms. Moreover, this construction is ''formulaic'' in the sense that it works in essentially the same way for any rng.

This is rather vague, though suggestive, and can be made precise in the language of category theory: a construction is ''most efficient'' if it satisfies a [[universal property]], and is ''formulaic'' if it defines a [[functor]]. Universal properties come in two types: initial properties and terminal properties. Since these are [[dual (category theory)|dual]] (opposite) notions, it is only necessary to discuss one of them.

The idea of using an initial property is to set up the problem in terms of some auxiliary category ''E'', and then identify that what we want is to find an [[initial object]] of ''E''. This has an advantage that the ''optimization'' — the sense that we are finding the ''most efficient'' solution — means something rigorous and is recognisable, rather like the attainment of a [[supremum]]. The category ''E'' is also formulaic in this construction, since it is always the category of elements of the functor to which one is constructing an adjoint. In fact, this latter category is precisely the [[comma category]] over the functor in question.

As an example, take the given rng ''R'', and make a category ''E'' whose ''objects'' are rng homomorphisms ''R'' → ''S'', with ''S'' a ring having a multiplicative identity. The ''morphisms'' in ''E'' between ''R'' → ''S1'' and ''R'' → ''S2'' are [[commutative diagram|commutative triangles]] of the form (''R'' → ''S1'',''R'' → ''S2'', ''S1'' → ''S2'') where S1 → S2 is a ring map (which preserves the identity). Note that this is precisely the definition of the comma category of ''R'' over the inclusion of unitary rings into rng. The existence of a morphism between ''R'' → ''S1'' and ''R'' → ''S2'' implies that ''S1'' is at least as efficient a solution as ''S2'' to our problem: ''S2'' can have more adjoined elements and/or more relations not imposed by axioms than ''S1''.
Therefore, the assertion that an object ''R'' → ''R*'' is initial in ''E'', that is, that there is a morphism from it to any other element of ''E'', means that the ring ''R''* is a ''most efficient'' solution to our problem.

The two facts that this method of turning rngs into rings is ''most efficient'' and ''formulaic'' can be expressed simultaneously by saying that it defines an ''adjoint functor''.



===Symmetry of optimization problems===

Continuing this discussion, suppose we ''started'' with the functor ''F'', and posed the following (vague) question: is there a problem to which ''F'' is the most efficient solution?

The notion that ''F'' is the ''most efficient solution'' to the problem posed by ''G'' is, in a certain rigorous sense, equivalent to the notion that ''G'' poses the ''most difficult problem'' that ''F'' solves. {{Citation needed|date=March 2011}}

This has the intuitive meaning that adjoint functors should occur in pairs, and in fact they do, but this is not trivial from the universal morphism definitions. The equivalent symmetric definitions involving adjunctions and the symmetric language of adjoint functors (we can say either ''F'' is left adjoint to ''G'' or ''G'' is right adjoint to ''F'') have the advantage of making this fact explicit.

==Formal definitions==

There are various definitions for adjoint functors. Their equivalence is elementary but not at all trivial and in fact highly useful. This article provides several such definitions:

* The definitions via universal morphisms are easy to state, and require minimal verifications when constructing an adjoint functor or proving two functors are adjoint. They are also the most analogous to our intuition involving optimizations.
* The definition via counit-unit adjunction is convenient for proofs about functors which are known to be adjoint, because they provide formulas that can be directly manipulated.
* The definition via hom-sets makes symmetry the most apparent, and is the reason for using the word ''adjoint''.

Adjoint functors arise everywhere, in all areas of mathematics. Their full usefulness lies in that the structure in any of these definitions gives rise to the structures in the others via a long but trivial series of deductions. Thus, switching between them makes implicit use of a great deal of tedious details that would otherwise have to be repeated separately in every subject area. For example, naturality and terminality of the counit can be used to prove that any right adjoint functor preserves limits.

===Conventions===

The theory of adjoints has the terms ''left'' and ''right'' at its foundation, and there are many components which live in one of two categories ''C'' and ''D'' which are under consideration. It can therefore be extremely helpful to choose letters in alphabetical order according to whether they live in the "lefthand" category ''C'' or the "righthand" category ''D'', and also to write them down in this order whenever possible.

In this article for example, the letters ''X'', ''F'', ''f'', ε will consistently denote things which live in the category ''C'', the letters ''Y'', ''G'', ''g'', η will consistently denote things which live in the category ''D'', and whenever possible such things will be referred to in order from left to right (a functor ''F'':''C''←''D'' can be thought of as "living" where its outputs are, in ''C'').

===Universal morphisms===

A functor ''F'' : ''C'' ← ''D'' is a '''left adjoint functor''' if for each object ''X'' in ''C'', there exists a [[terminal morphism]] from ''F'' to ''X''. If, for each object ''X'' in ''C'', we choose an object ''G''0''X'' of ''D'' for which there is a terminal morphism ε''X'' : ''F''(''G''0''X'') → ''X'' from ''F'' to ''X'', then there is a unique functor ''G'' : ''C'' → ''D'' such that ''GX'' = ''G''0''X'' and ε''Xʹ'' ∘ ''FG''(''f'') = ''f'' ∘ ε''X'' for ''f'' : ''X'' → ''Xʹ'' a morphism in ''C''; ''F'' is then called a '''left adjoint to ''' ''G''.

A functor ''G'' : ''C'' → ''D'' is a '''right adjoint functor''' if for each object ''Y'' in ''D'', there exists an [[initial morphism]] from ''Y'' to ''G''. If, for each object ''Y'' in ''D'', we choose an object ''F''0''Y'' of ''C'' and an initial morphism η''Y'' : ''Y'' → ''G''(''F''0''Y'') from ''Y'' to ''G'', then there is a unique functor ''F'' : ''C'' ← ''D'' such that ''FY'' = ''F''0''Y'' and ''GF''(''g'') ∘ η''Y'' = η''Yʹ'' ∘ ''g'' for ''g'' : ''Y'' → ''Yʹ'' a morphism in ''D''; ''G'' is then called a '''right adjoint to ''' ''F''.

''' ''Remarks: '' '''

It is true, as the terminology implies, that ''F'' is ''left adjoint to G'' if and only if ''G'' is ''right adjoint to F''. This is apparent from the symmetric definitions given below. The definitions via universal morphisms are often useful for establishing that a given functor is left or right adjoint, because they are minimalistic in their requirements. They are also intuitively meaningful in that finding a universal morphism is like solving an optimization problem.

===Counit-unit adjunction===

A '''counit-unit adjunction''' between two categories ''C'' and ''D'' consists of two [[functor]]s ''F'' : ''C'' ← ''D'' and ''G'' : ''C'' → ''D'' and two [[natural transformation]]s
:<math>\begin{align}
\varepsilon &: FG \to 1_{\mathcal C} \\
\eta &: 1_{\mathcal D} \to GF\end{align}</math>
respectively called the '''counit''' and the '''unit''' of the adjunction (terminology from [[universal algebra]]), such that the compositions
:<math>F\xrightarrow{\;F\eta\;}FGF\xrightarrow{\;\varepsilon F\,}F</math>
:<math>G\xrightarrow{\;\eta G\;}GFG\xrightarrow{\;G \varepsilon\,}G</math>
are the identity transformations 1''F'' and 1''G'' on ''F'' and ''G'' respectively.

In this situation we say that ''' ''F'' is left adjoint to ''G'' ''' and ''' ''G'' is right adjoint to ''F'' ''', and may indicate this relationship by writing  <math>(\varepsilon,\eta):F\dashv G</math> , or simply  <math>F\dashv G</math> .

In equation form, the above conditions on (ε,η) are the '''counit-unit equations'''
:<math>\begin{align}
1_F &= \varepsilon F\circ F\eta\\
1_G &= G\varepsilon \circ \eta G
\end{align}</math>
which mean that for each ''X'' in ''C'' and each ''Y'' in ''D'',
:<math>\begin{align}
1_{FY} &= \varepsilon_{FY}\circ F(\eta_Y) \\
1_{GX} &= G(\varepsilon_X)\circ\eta_{GX}
\end{align}</math>.

Note that here <math>1</math> denotes identity functors, while above the same symbol was used for identity natural transformations.

These equations are useful in reducing proofs about adjoint functors to algebraic manipulations. They are sometimes called the ''zig-zag equations'' because of the appearance of the corresponding [[string diagram]]s. A way to remember them is to first write down the nonsensical equation <math>1=\varepsilon\circ\eta</math> and then fill in either ''F'' or ''G'' in one of the two simple ways which make the compositions defined.

Note: The use of the prefix "co" in counit here is not consistent with the terminology of limits and colimits, because a colimit satisfies an ''initial'' property whereas the counit morphisms will satisfy ''terminal'' properties, and dually. The term ''unit'' here is borrowed from the theory of [[Monad (category theory)|monads]] where it looks like the insertion of the identity 1 into a monoid.

===Hom-set adjunction===

A '''[[hom-set]] adjunction''' between two categories ''C'' and ''D'' consists of two [[functor]]s ''F'' : ''C'' ← ''D'' and ''G'' : ''C'' → ''D'' and a [[natural isomorphism]]
:<math>\Phi:\mathrm{hom}_C(F-,-) \to \mathrm{hom}_D(-,G-)</math>.
This specifies a family of bijections
:<math>\Phi_{Y,X}:\mathrm{hom}_C(FY,X) \to \mathrm{hom}_D(Y,GX)</math>.
for all objects ''X'' in ''C'' and ''Y'' in ''D''.

In this situation we say that ''' ''F'' is left adjoint to ''G'' ''' and ''' ''G'' is right adjoint to ''F'' ''', and may indicate this relationship by writing  <math>\Phi:F\dashv G</math> , or simply  <math>F\dashv G</math> .

This definition is a logical compromise in that it is somewhat more difficult to satisfy than the universal morphism definitions, and has fewer immediate implications than the counit-unit definition. It is useful because of its obvious symmetry, and as a stepping-stone between the other definitions.

In order to interpret Φ as a ''natural isomorphism'', one must recognize hom''C''(''F''–, –) and hom''D''(–, ''G''–) as functors. In fact, they are both [[bifunctor]]s from ''D''op × ''C'' to '''Set''' (the [[category of sets]]). For details, see the article on [[hom functor]]s. Explicitly, the naturality of Φ means that for all [[morphism]]s ''f'' : ''X'' → ''X′'' in ''C'' and all morphisms ''g'' : ''Y′ '' → ''Y'' in ''D'' the following diagram [[commutative diagram|commutes]]:

[[File:Natural phi.svg|center|Naturality of Φ|400px]]

The vertical arrows in this diagram are those induced by composition with ''f'' and ''g''. Formally, Hom(''Fg'', ''f'') : HomC(''FY'', ''X'') → HomC(''FY′'', ''X′'') is given by ''h'' → ''f o h o Fg'' for each ''h'' in HomC(''FY'', ''X''). Hom(''g'', ''Gf'') is similar.

==Adjunctions in full==

There are hence numerous functors and natural transformations associated with every adjunction, and only a small portion is sufficient to determine the rest.

An ''adjunction'' between categories ''C'' and ''D'' consists of
*A [[functor]] ''F'' : ''C'' ← ''D'' called the '''left adjoint'''
*A functor ''G'' : ''C'' → ''D'' called the '''right adjoint'''
*A [[natural isomorphism]] Φ : hom''C''(''F''–,–) → hom''D''(–,''G''–)
*A [[natural transformation]] ε : ''FG'' → 1''C'' called the '''counit'''
*A natural transformation η : 1''D'' → ''GF'' called the '''unit'''

An equivalent formulation, where ''X'' denotes any object of ''C'' and ''Y'' denotes any object of ''D'':

''For every ''C''-morphism ''f'' : ''FY'' → ''X'', there is a unique ''D''-morphism Φ''Y'', ''X''(''f'') = ''g'' : ''Y'' → ''GX'' such that the diagrams below commute, and for every ''D''-morphism ''g'' : ''Y'' → ''GX'', there is a unique ''C''-morphism Φ−1''Y'', ''X''(''g'') = ''f'' : ''FY'' → ''X'' in ''C'' such that the diagrams below commute:''

[[File:Adjoint functors sym.svg|center|350px]]

From this assertion, one can recover that:
*The transformations ε, η, and Φ are related by the equations
:<math>\begin{align}
f = \Phi_{Y,X}^{-1}(g) &= \varepsilon_X\circ F(g) & \in & \, \, \mathrm{hom}_C(F(Y),X)\\
g = \Phi_{Y,X}(f) &= G(f)\circ \eta_Y & \in & \, \, \mathrm{hom}_D(Y,G(X))\\
\Phi_{GX,X}^{-1}(1_{GX}) &= \varepsilon_X & \in & \, \, \mathrm{hom}_C(FG(X),X)\\
\Phi_{Y,FY}(1_{FY}) &= \eta_Y & \in & \, \, \mathrm{hom}_D(Y,GF(Y))\\
\end{align}
</math>
*The transformations ε, η satisfy the counit-unit equations
:<math>\begin{align}
1_{FY} &= \varepsilon_{FY} \circ F(\eta_Y)\\
1_{GX} &= G(\varepsilon_X) \circ \eta_{GX}
\end{align}</math>
*Each pair (''GX'', ε''X'') is a [[universal morphism|terminal morphism]] from ''F'' to ''X'' in ''C''
*Each pair (''FY'', η''Y'') is an [[universal morphism|initial morphism]] from ''Y'' to ''G'' in ''D''

In particular, the equations above allow one to define Φ, ε, and η in terms of any one of the three. However, the adjoint functors ''F'' and ''G'' alone are in general not sufficient to determine the adjunction. We will demonstrate the equivalence of these situations below.

===Universal morphisms induce hom-set adjunction===

Given a right adjoint functor ''G'' : ''C'' → ''D''; in the sense of initial morphisms, one may construct the induced hom-set adjunction by doing the following steps.

* Construct a functor ''F'' : ''C'' ← ''D'' and a natural transformation η.
** For each object ''Y'' in ''D'', choose an initial morphism (''F''(''Y''), η''Y'') from ''Y'' to ''G'', so we have η''Y'' : ''Y'' → ''G''(''F''(''Y'')). We have the map of ''F'' on objects and the family of morphisms η.
** For each ''f'' : ''Y''0 → ''Y''1, as (''F''(''Y''0), η''Y''0) is an initial morphism, then factorize η''Y''1 o ''f'' with η''Y''0 and get ''F''(''f'') : ''F''(''Y''0) → ''F''(''Y''1). This is the map of ''F'' on morphisms.
** The commuting diagram of that factorization implies the commuting diagram of natural transformations, so η : 1''D'' → ''G'' o ''F'' is a [[natural transformation]].
** Uniqueness of that factorization and that ''G'' is a functor implies that the map of ''F'' on morphisms preserves compositions and identities.
* Construct a natural isomorphism Φ : hom''C''(''F''-,-) → hom''D''(-,''G''-).
** For each object ''X'' in ''C'', each object ''Y'' in ''D'', as (''F''(''Y''), η''Y'') is an initial morphism, then Φ''Y'', ''X'' is a bijection, where Φ''Y'', ''X''(''f'' : ''F''(''Y'') → ''X'') = ''G''(''f'') o η''Y''.
** η is a natural transformation, ''G'' is a functor, then for any objects ''X''0, ''X''1 in ''C'', any objects ''Y''0, ''Y''1 in ''D'', any ''x'' : ''X''0 → ''X''1, any ''y'' : ''Y''1 → ''Y''0, we have Φ''Y''1, ''X''1(''x'' o ''f'' o ''F''(''y'')) = G(x) o ''G''(''f'') o ''G''(''F''(''y'')) o η''Y''1 = ''G''(''x'') o ''G''(''f'') o η''Y''0 o ''y'' = ''G''(''x'') o Φ''Y''0, ''X''0(''f'') o ''y'', and then Φ is natural in both arguments.

A similar argument allows one to construct a hom-set adjunction from the terminal morphisms to a left adjoint functor. (The construction that starts with a right adjoint is slightly more common, since the right adjoint in many adjoint pairs is a trivially defined inclusion or forgetful functor.)

===Counit-unit adjunction induces hom-set adjunction===

Given functors ''F'' : ''C'' ← ''D'', ''G'' : ''C'' → ''D'', and a counit-unit adjunction (ε, η) : ''F'' <math>\dashv</math> ''G'', we can construct a hom-set adjunction by finding the natural transformation Φ : hom''C''(''F''-,-) → hom''D''(-,''G''-) in the following steps:

*For each ''f'' : ''FY'' → ''X'' and each ''g'' : ''Y'' → ''GX'', define
:<math>\begin{align}\Phi_{Y,X}(f) = G(f)\circ \eta_Y\\
\Psi_{Y,X}(g) = \varepsilon_X\circ F(g)\end{align}</math>
:The transformations Φ and Ψ are natural because η and ε are natural.

*Using, in order, that ''F'' is a functor, that ε is natural, and the counit-unit equation 1''FY'' = ε''FY'' o ''F''(η''Y''), we obtain
:<math>\begin{align}
\Psi\Phi f &= \varepsilon_X\circ FG(f)\circ F(\eta_Y) \\
&= f\circ \varepsilon_{FY}\circ F(\eta_Y) \\
&= f\circ 1_{FY} = f\end{align}</math>
:hence ΨΦ is the identity transformation.

*Dually, using that ''G'' is a functor, that η is natural, and the counit-unit equation 1''GX'' = ''G''(ε''X'') o η''GX'', we obtain
:<math>\begin{align}
\Phi\Psi g &= G(\varepsilon_X)\circ GF(g)\circ\eta_Y \\
&= G(\varepsilon_X)\circ\eta_{GX}\circ g \\
&= 1_{GX}\circ g = g\end{align}</math>
:hence ΦΨ is the identity transformation. Thus Φ is a natural isomorphism with inverse Φ−1 = Ψ.

===Hom-set adjunction induces all of the above===

Given functors ''F'' : ''C'' ← ''D'', ''G'' : ''C'' → ''D'', and a hom-set adjunction Φ : hom''C''(''F''-,-) → hom''D''(-,''G''-), we can construct a counit-unit adjunction

:<math>(\varepsilon,\eta):F\dashv G</math> ,

which defines families of initial and terminal morphisms, in the following steps:

*Let  <math>\varepsilon_X=\Phi_{GX,X}^{-1}(1_{GX})\in\mathrm{hom}_C(FGX,X)</math>  for each ''X'' in ''C'', where  <math>1_{GX}\in\mathrm{hom}_D(GX,GX)</math>  is the identity morphism.

*Let  <math>\eta_Y=\Phi_{Y,FY}(1_{FY})\in\mathrm{hom}_D(Y,GFY)</math>  for each ''Y'' in ''D'', where  <math>1_{FY}\in\mathrm{hom}_C(FY,FY)</math>  is the identity morphism.

*The bijectivity and naturality of Φ imply that each (''GX'', ε''X'') is a terminal morphism from ''F'' to ''X'' in ''C'', and each (''FY'', η''Y'') is an initial morphism from ''Y'' to ''G'' in ''D''.

*The naturality of Φ implies the naturality of ε and η, and the two formulas
:<math>\begin{align}\Phi_{Y,X}(f) = G(f)\circ \eta_Y\\
\Phi_{Y,X}^{-1}(g) = \varepsilon_X\circ F(g)\end{align}</math>
:for each ''f'': ''FY'' → ''X'' and ''g'': ''Y'' → ''GX'' (which completely determine Φ).

*Substituting ''FY'' for ''X'' and η''Y'' = Φ''Y'', ''FY''(1''FY'') for ''g'' in the second formula gives the first counit-unit equation
:<math>1_{FY} = \varepsilon_{FY}\circ F(\eta_Y)</math>,
:and substituting ''GX'' for ''Y'' and εX = Φ−1''GX, X''(1''GX'') for ''f'' in the first formula gives the second counit-unit equation
:<math>1_{GX} = G(\varepsilon_X)\circ\eta_{GX}</math>.

==History==

===Ubiquity===

The idea of an adjoint functor was formulated by [[Daniel Kan]] in 1958. Like many of the concepts in category theory, it was suggested by the needs of [[homological algebra]], which was at the time devoted to computations. Those faced with giving tidy, systematic presentations of the subject would have noticed relations such as

:hom(''F''(''X''), ''Y'') = hom(''X'', ''G''(''Y''))

in the category of [[abelian group]]s, where ''F'' was the functor <math>- \otimes A</math> (i.e. take the [[tensor product]] with ''A''), and ''G'' was the functor hom(''A'',–).

The use of the ''equals'' sign is an [[abuse of notation]]; those two groups are not really identical but there is a way of identifying them that is ''natural''. It can be seen to be natural on the basis, firstly, that these are two alternative descriptions of the [[bilinear mapping]]s from ''X'' × ''A'' to ''Y''. That is, however, something particular to the case of tensor product. In category theory the 'naturality' of the bijection is subsumed in the concept of a [[natural isomorphism]].

The terminology comes from the [[Hilbert space]] idea of [[adjoint operator]]s ''T'', ''U'' with <math>\langle Tx,y\rangle = \langle x,Uy\rangle</math>, which is formally similar to the above relation between hom-sets. We say that ''F'' is ''left adjoint'' to ''G'', and ''G'' is ''right adjoint'' to ''F''. Note that ''G'' may have itself a right adjoint that is quite different from ''F'' (see below for an example). The analogy to adjoint maps of Hilbert spaces can be made precise in certain contexts.<ref>[http://www.arxiv.org/abs/q-alg/9609018 arXiv.org: John C. Baez ''Higher-Dimensional Algebra II: 2-Hilbert Spaces''].</ref>

If one starts looking for these adjoint pairs of functors, they turn out to be very common in [[abstract algebra]], and elsewhere as well. The example section below provides evidence of this; furthermore, [[universal construction]]s, which may be more familiar to some, give rise to numerous adjoint pairs of functors.

In accordance with the thinking of [[Saunders Mac Lane]], any idea, such as adjoint functors, that occurs widely enough in mathematics should be studied for its own sake.{{Citation needed|date=November 2007}}

===Problems formulations===

Mathematicians do not generally need the full adjoint functor concept. Concepts can be judged according to their use in solving problems, as well as for their use in building theories. The tension between these two motivations was especially great during the 1950s when category theory was initially developed. Enter [[Alexander Grothendieck]], who used category theory to take compass bearings in other work — in [[functional analysis]], [[homological algebra]] and finally [[algebraic geometry]].

It is probably wrong to say that he promoted the adjoint functor concept in isolation: but recognition of the role of adjunction was inherent in Grothendieck's approach. For example, one of his major achievements was the formulation of [[Serre duality]] in relative form — loosely, in a continuous family of algebraic varieties. The entire proof turned on the existence of a right adjoint to a certain functor. This is something undeniably abstract, and non-constructive, but also powerful in its own way.

===Posets===

Every [[partially ordered set]] can be viewed as a category (with a single morphism between ''x'' and ''y'' if and only if ''x'' ≤ ''y''). A pair of adjoint functors between two partially ordered sets is called a [[Galois connection]] (or, if it is contravariant, an ''antitone'' Galois connection). See that article for a number of examples: the case of [[Galois theory]] of course is a leading one. Any Galois connection gives rise to [[closure operator]]s and to inverse order-preserving bijections between the corresponding closed elements.

As is the case for Galois groups, the real interest lies often in refining a correspondence to a [[Duality (mathematics)|duality]] (i.e. ''antitone'' order isomorphism). A treatment of Galois theory along these lines by [[Irving Kaplansky|Kaplansky]] was influential in the recognition of the general structure here.

The partial order case collapses the adjunction definitions quite noticeably, but can provide several themes:
* adjunctions may not be dualities or isomorphisms, but are candidates for upgrading to that status
* closure operators may indicate the presence of adjunctions, as corresponding [[monad (category theory)|monads]] (cf. the [[Kuratowski closure axioms]])
* a very general comment of [[William Lawvere]]<ref>[[William Lawvere]], Adjointness in foundations, Dialectica, 1969, [http://www.tac.mta.ca/tac/reprints/articles/16/tr16abs.html available here]. The notation is different nowadays; an easier introduction by Peter Smith [http://www.logicmatters.net/resources/pdfs/Galois.pdf in these lecture notes], which also attribute the concept to the article cited.</ref> is that ''syntax and semantics'' are adjoint: take ''C'' to be the set of all logical theories (axiomatizations), and ''D'' the power set of the set of all mathematical structures. For a theory ''T'' in ''C'', let ''F''(''T'') be the set of all structures that satisfy the axioms ''T''; for a set of mathematical structures ''S'', let ''G''(''S'') be the minimal axiomatization of ''S''. We can then say that ''F''(''T'') is a subset of ''S'' if and only if ''T'' logically implies ''G''(''S''): the "semantics functor" ''F'' is left adjoint to the "syntax functor" ''G''.
* [[division (mathematics)|division]] is (in general) the attempt to ''invert'' multiplication, but many examples, such as the introduction of [[material conditional|implication]] in [[propositional calculus|propositional logic]], or the [[ideal quotient]] for division by [[ring ideal]]s, can be recognised as the attempt to provide an adjoint.

Together these observations provide explanatory value all over mathematics.

==Examples ==


===Free groups===

The construction of [[free group]]s is a common and illuminating example.

Suppose that ''F'' : '''[[category of groups|Grp]]''' ← '''[[category of sets|Set]]''' is the functor assigning to each set ''Y'' the [[free group]] generated by the elements of ''Y'', and that ''G'' : '''Grp''' → '''Set''' is the [[forgetful functor]], which assigns to each group ''X'' its underlying set. Then ''F'' is left adjoint to ''G'':

'''Terminal morphisms.''' For each group ''X'', the group ''FGX'' is the free group generated freely by ''GX'', the elements of ''X''. Let  <math>\varepsilon_X:FGX\to X</math>  be the group homomorphism which sends the generators of ''FGX'' to the elements of ''X'' they correspond to, which exists by the universal property of free groups. Then each  <math>(GX,\varepsilon_X)</math>  is a terminal morphism from ''F'' to ''X'', because any group homomorphism from a free group ''FZ'' to ''X'' will factor through  <math>\varepsilon_X:FGX\to X</math>  via a unique set map from ''Z'' to ''GX''. This means that (''F'',''G'') is an adjoint pair.

'''Initial morphisms.''' For each set ''Y'', the set ''GFY'' is just the underlying set of the free group ''FY'' generated by ''Y''. Let  <math>\eta_Y:Y\to GFY</math>  be the set map given by "inclusion of generators". Then each  <math>(FY,\eta_Y)</math>  is an initial morphism from ''Y'' to ''G'', because any set map from ''Y'' to the underlying set ''GW'' of a group will factor through  <math>\eta_Y:Y\to GFY</math>  via a unique group homomorphism from ''FY'' to ''W''. This also means that (''F'',''G'') is an adjoint pair.

'''Hom-set adjunction.''' Maps from the free group ''FY'' to a group ''X'' correspond precisely to maps from the set ''Y'' to the set ''GX'': each homomorphism from ''FY'' to ''X'' is fully determined by its action on generators. One can verify directly that this correspondence is a natural transformation, which means it is a hom-set adjunction for the pair (''F'',''G'').

'''Counit-unit adjunction.''' One can also verify directly that ε and η are natural. Then, a direct verification that they form a counit-unit adjunction  <math>(\varepsilon,\eta):F\dashv G</math>  is as follows:

'''The first counit-unit equation'''  <math>1_F = \varepsilon F\circ F\eta</math>  says that for each set ''Y'' the composition
:<math>FY\xrightarrow{\;F(\eta_Y)\;}FGFY\xrightarrow{\;\varepsilon_{FY}\,}FY</math>
should be the identity. The intermediate group ''FGFY'' is the free group generated freely by the words of the free group ''FY''. (Think of these words as placed in parentheses to indicate that they are independent generators.) The arrow  <math>F(\eta_Y)</math>  is the group homomorphism from ''FY'' into ''FGFY'' sending each generator ''y'' of ''FY'' to the corresponding word of length one (''y'') as a generator of ''FGFY''. The arrow  <math>\varepsilon_{FY}</math>  is the group homomorphism from ''FGFY'' to ''FY'' sending each generator to the word of ''FY'' it corresponds to (so this map is "dropping parentheses"). The composition of these maps is indeed the identity on ''FY''.

'''The second counit-unit equation'''  <math>1_G = G\varepsilon \circ \eta G</math>  says that for each group ''X'' the composition
: <math>GX\xrightarrow{\;\eta_{GX}\;}GFGX\xrightarrow{\;G(\varepsilon_X)\,}GX</math> 
should be the identity. The intermediate set ''GFGX'' is just the underlying set of ''FGX''. The arrow  <math>\eta_{GX}</math>  is the "inclusion of generators" set map from the set ''GX'' to the set ''GFGX''. The arrow  <math>G(\varepsilon_X)</math>  is the set map from ''GFGX'' to ''GX'' which underlies the group homomorphism sending each generator of ''FGX'' to the element of ''X'' it corresponds to ("dropping parentheses"). The composition of these maps is indeed the identity on ''GX''.

===Free constructions and forgetful functors===
[[Free object]]s are all examples of a left adjoint to a [[forgetful functor]] which assigns to an algebraic object its underlying set. These algebraic [[free functor]]s have generally the same description as in the detailed description of the free group situation above.

===Diagonal functors and limits===
[[Product (category theory)|Products]], [[Pullback (category theory)|fibred products]], [[Equalizer (mathematics)|equalizers]], and [[Kernel (algebra)|kernels]] are all examples of the categorical notion of a [[limit (category theory)|limit]]. Any limit functor is right adjoint to a corresponding diagonal functor (provided the category has the type of limits in question), and the counit of the adjunction provides the defining maps from the limit object (i.e. from the diagonal functor on the limit, in the functor category). Below are some specific examples.

* '''Products''' Let Π : '''Grp2''' → '''Grp''' the functor which assigns to each pair (''X''1, ''X2'') the product group ''X''1×''X''2, and let Δ : '''Grp2''' ← '''Grp''' be the [[diagonal functor]] which assigns to every group ''X'' the pair (''X'', ''X'') in the product category '''Grp2'''. The universal property of the product group shows that Π is right-adjoint to Δ. The counit of this adjunction is the defining pair of projection maps from ''X''1×''X''2 to ''X''1 and ''X''2 which define the limit, and the unit is the ''diagonal inclusion'' of a group X into ''X''1×''X''2 (mapping x to (x,x)).

: The [[cartesian product]] of [[Set (mathematics)|sets]], the product of rings, the [[product topology|product of topological spaces]] etc. follow the same pattern; it can also be extended in a straightforward manner to more than just two factors. More generally, any type of limit is right adjoint to a diagonal functor.

* '''Kernels.''' Consider the category ''D'' of homomorphisms of abelian groups. If ''f''1 : ''A''1 → ''B''1 and ''f''2 : ''A''2 → ''B''2 are two objects of ''D'', then a morphism from ''f''1 to ''f''2 is a pair (''g''''A'', ''g''''B'') of morphisms such that ''gBf''1 = ''f''2''gA''. Let ''G'' : ''D'' → '''Ab''' be the functor which assigns to each homomorphism its [[kernel (algebra)|kernel]] and let ''F'' : '''D''' ← '''Ab''' be the functor which maps the group ''A'' to the homomorphism ''A'' → 0. Then ''G'' is right adjoint to ''F'', which expresses the universal property of kernels. The counit of this adjunction is the defining embedding of a homomorphism's kernel into the homomorphism's domain, and the unit is the morphism identifying a group ''A'' with the kernel of the homomorphism ''A'' → 0.

: A suitable variation of this example also shows that the kernel functors for vector spaces and for modules are right adjoints. Analogously, one can show that the cokernel functors for abelian groups, vector spaces and modules are left adjoints.

===Colimits and diagonal functors===
[[Coproduct]]s, [[Pushout (category theory)|fibred coproducts]], [[coequalizer]]s, and [[cokernel]]s are all examples of the categorical notion of a [[limit (category theory)|colimit]]. Any colimit functor is left adjoint to a corresponding diagonal functor (provided the category has the type of colimits in question), and the unit of the adjunction provides the defining maps into the colimit object. Below are some specific examples.

* '''Coproducts.''' If ''F'' : '''[[category of abelian groups|Ab]]''' ← '''Ab2''' assigns to every pair (''X''1, ''X''2) of abelian groups their [[Direct sum of groups|direct sum]], and if ''G'' : '''Ab''' → '''Ab2''' is the functor which assigns to every abelian group ''Y'' the pair (''Y'', ''Y''), then ''F'' is left adjoint to ''G'', again a consequence of the universal property of direct sums. The unit of this adjoint pair is the defining pair of inclusion maps from ''X''1 and ''X''2 into the direct sum, and the counit is the additive map from the direct sum of (''X'',''X'') to back to ''X'' (sending an element (''a'',''b'') of the direct sum to the element ''a''+''b'' of ''X'').

: Analogous examples are given by the [[Direct sum of modules|direct sum]] of [[vector space]]s and [[module (mathematics)|modules]], by the [[free product]] of groups and by the disjoint union of sets.

===Further examples===

==== Algebra ====
* '''Adjoining an identity to a [[Rng (algebra)|rng]].''' This example was discussed in the motivation section above. Given a rng ''R'', a multiplicative identity element can be added by taking ''R''x'''Z''' and defining a '''Z'''-bilinear product with (r,0)(0,1) = (0,1)(r,0) = (r,0), (r,0)(s,0) = (rs,0), (0,1)(0,1) = (0,1). This constructs a left adjoint to the functor taking a ring to the underlying rng.

* '''Ring extensions.''' Suppose ''R'' and ''S'' are rings, and ρ : ''R'' → ''S'' is a [[ring homomorphism]]. Then ''S'' can be seen as a (left) ''R''-module, and the [[tensor product]] with ''S'' yields a functor ''F'' : ''R''-'''Mod''' → ''S''-'''Mod'''. Then ''F'' is left adjoint to the forgetful functor ''G'' : ''S''-'''Mod''' → ''R''-'''Mod'''.

* '''[[Tensor-hom adjunction|Tensor products]].''' If ''R'' is a ring and ''M'' is a right ''R'' module, then the tensor product with ''M'' yields a functor ''F'' : ''R''-'''Mod''' → '''Ab'''. The functor ''G'' : '''Ab''' → ''R''-'''Mod''', defined by ''G''(''A'') = hom'''Z'''(''M'',''A'') for every abelian group ''A'', is a right adjoint to ''F''.

* '''From monoids and groups to rings''' The [[integral monoid ring]] construction gives a functor from [[monoid]]s to rings. This functor is left adjoint to the functor that associates to a given ring its underlying multiplicative monoid. Similarly, the [[integral group ring]] construction yields a functor from [[group (mathematics)|groups]] to rings, left adjoint to the functor that assigns to a given ring its [[group of units]]. One can also start with a [[field (mathematics)|field]] ''K'' and consider the category of ''K''-[[associative algebra|algebras]] instead of the category of rings, to get the monoid and group rings over ''K''.

* '''Field of fractions.''' Consider the category '''Dom'''m of integral domains with injective morphisms. The forgetful functor '''Field''' → '''Dom'''m from fields has a left adjoint - it assigns to every integral domain its [[field of fractions]].

* '''Polynomial rings'''. Let '''Ring'''* be the category of pointed commutative rings with unity (pairs (A,a) where A is a ring, <math>a \in A</math> and morphisms preserve the distinguished elements). The forgetful functor G:'''Ring'''* → '''Ring''' has a left adjoint - it assigns to every ring R the pair (R[x],x) where R[x] is the [[polynomial ring]] with coefficients from R.

* '''Abelianization'''. Consider the inclusion functor ''G'' : '''Ab''' → '''Grp''' from the [[category of abelian groups]] to [[category of groups]]. It has a left adjoint called [[abelianization]] which assigns to every group ''G'' the quotient group ''G''ab=''G''/[''G'',''G''].

* '''The Grothendieck group'''. In [[K-theory]], the point of departure is to observe that the category of [[vector bundle]]s on a [[topological space]] has a commutative monoid structure under [[Direct sum of modules|direct sum]]. One may make an [[abelian group]] out of this monoid, the [[Grothendieck group]], by formally adding an additive inverse for each bundle (or equivalence class). Alternatively one can observe that the functor that for each group takes the underlying monoid (ignoring inverses) has a left adjoint. This is a once-for-all construction, in line with the third section discussion above. That is, one can imitate the construction of [[negative number]]s; but there is the other option of an [[existence theorem]]. For the case of finitary algebraic structures, the existence by itself can be referred to [[universal algebra]], or [[model theory]]; naturally there is also a proof adapted to category theory, too.

* '''Frobenius reciprocity''' in the [[group representation|representation theory of groups]]: see [[induced representation]]. This example foreshadowed the general theory by about half a century.

====Topology====
* '''A functor with a left and a right adjoint.''' Let ''G'' be the functor from [[topological space]]s to [[Set (mathematics)|sets]] that associates to every topological space its underlying set (forgetting the topology, that is). ''G'' has a left adjoint ''F'', creating the [[discrete space]] on a set ''Y'', and a right adjoint ''H'' creating the [[trivial topology]] on ''Y''.
* '''Suspensions and loop spaces''' Given [[topological spaces]] ''X'' and ''Y'', the space [''SX'', ''Y''] of [[homotopy classes]] of maps from the [[suspension (topology)|suspension]] ''SX'' of ''X'' to ''Y'' is naturally isomorphic to the space [''X'', Ω''Y''] of homotopy classes of maps from ''X'' to the [[loop space]] Ω''Y'' of ''Y''. This is an important fact in [[homotopy theory]].
* '''Stone-Čech compactification.''' Let '''KHaus''' be the category of [[compact space|compact]] [[Hausdorff space]]s and ''G'' : '''KHaus''' → '''Top''' be the inclusion functor to the category of [[topological spaces]]. Then ''G'' has a left adjoint ''F'' : '''Top''' → '''KHaus''', the [[Stone–Čech compactification]]. The unit of this adjoint pair yields a [[continuous function (topology)|continuous]] map from every topological space ''X'' into its Stone-Čech compactification. This map is an [[embedding]] (i.e. [[injective]], continuous and open) if and only if ''X'' is a [[Tychonoff space]].
* '''Direct and inverse images of sheaves''' Every [[continuous map]] ''f'' : ''X'' → ''Y'' between [[topological space]]s induces a functor ''f'' ∗ from the category of [[sheaf (mathematics)|sheaves]] (of sets, or abelian groups, or rings...) on ''X'' to the corresponding category of sheaves on ''Y'', the ''[[direct image functor]]''. It also induces a functor ''f'' −1 from the category of sheaves of abelian groups on ''Y'' to the category of sheaves of abelian groups on ''X'', the ''[[inverse image functor]]''. ''f'' −1 is left adjoint to ''f'' ∗. Here a more subtle point is that the left adjoint for [[coherent sheaf|coherent sheaves]] will differ from that for sheaves (of sets).
* '''Soberification.''' The article on [[Stone duality]] describes an adjunction between the category of topological spaces and the category of [[sober space]]s that is known as soberification. Notably, the article also contains a detailed description of another adjunction that prepares the way for the famous [[duality (category theory)|duality]] of sober spaces and spatial locales, exploited in [[pointless topology]].

====Category theory====
* '''A series of adjunctions.''' The functor π0 which assigns to a category its set of connected components is left-adjoint to the functor ''D'' which assigns to a set the discrete category on that set. Moreover, ''D'' is left-adjoint to the object functor ''U'' which assigns to each category its set of objects, and finally ''U'' is left-adjoint to ''A'' which assigns to each set the [http://ncatlab.org/nlab/show/indiscrete+category indiscrete category] on that set.
* '''Exponential object'''. In a [[cartesian closed category]] the endofunctor ''C'' → ''C'' given by –×''A'' has a right adjoint –''A''.


====Categorical logic====
* '''Quantification.''' If <math>\phi_Y</math> is a unary predicate expressing some property, then a sufficiently strong set theory may prove the existence of the set <math>Y=\{y\mid\phi_Y(y)\}</math> of terms that fulfill the property. A proper subset <math>T\subset Y</math> and the associated injection of <math>T</math> into <math>Y</math> is characterized by a predicate <math>\phi_T(y)=\phi_Y(y)\land\varphi(y)</math> expressing a strictly more restrictive property.
:The role of [[Quantifier (logic)|quantifiers]] in predicate logics is in forming propositions and also in expressing sophisticated predicates by closing formulas with possibly more variables. For example, consider a predicate <math>\psi_f</math> with two open variables of sort <math>X</math> and <math>Y</math>. Using a quantifier to close <math>X</math>, we can form the set
::<math>\{y\in Y\mid \exists x.\,\psi_f(x,y)\land\phi_{S}(x)\}</math>
:of all elements <math>y</math> of <math>Y</math> for which there is an <math>x</math> to which it is <math>\psi_f</math>-related, and which itself is characterized by the property <math>\phi_{S}</math>. Set theoretic operations like the intersection <math>\cap</math> of two sets directly corresponds to the conjunction <math>\land</math> of predicates. In [[categorical logic]], a subfield of [[topos theory]], quantifiers are identified with adjoints to the pullback functor. Such a realization can be seen in analogy to the discussion of propositional logic using set theory but, interestingly, the general definition make for a richer range of logics.

:So consider an object <math>Y</math> in a category with pullbacks. Any morphism <math>f:X\to Y</math> induces a functor
::<math>f^{*} : \text{Sub}(Y) \longrightarrow \text{Sub}(X)</math>
:on the category that is the preorder of subobjects. It maps subobjects <math>T</math> of <math>Y</math> (technically: monomorphism classes of <math>T\to Y</math>) to the pullback <math>X\times_Y T</math>. If this functor has a left- or right adjoint, they are called <math>\exists_f</math> and <math>\forall_f</math>, respectively.<ref>Saunders Mac Lane, Ieke Moerdijk, (1992) ''Sheaves in Geometry and Logic'' Springer-Verlag. ISBN 0-387-97710-4 ''See page 58''</ref> They both map from <math>\text{Sub}(X)</math> back to <math>\text{Sub}(Y)</math>. Very roughly, given a domain <math>S\subset X</math> to quantify a relation expressed via <math>f</math> over, the functor/quantifier closes <math>X</math> in <math>X\times_Y T</math> and returns the thereby specified subset of <math>Y</math>.

: '''Example''': In <math>\operatorname{Set}</math>, the category of sets and functions, the canonical subobjects are the subset (or rather their canonical injections). The pullback <math>f^{*}T=X\times_Y T</math> of an injection of a subset <math>T</math> into <math>Y</math> along <math>f</math> is characterized as the largest set which knows all about <math>f</math> and the injection of <math>T</math> into <math>Y</math>. It therefore turns out to be (in bijection with) the inverse image <math>f^{-1}[T]\subseteq X</math>.
:For <math>S \subseteq X</math>, let us figure out the left adjoint, which is defined via
::<math>{\operatorname{Hom}}(\exists_f S,T)
\cong
{\operatorname{Hom}}(S,f^{*}T),</math>
:which here just means
::<math>\exists_f S\subseteq T
\leftrightarrow
S\subseteq f^{-1}[T]</math>.

:Consider <math> f[S] \subseteq T </math>. We see <math>S\subseteq f^{-1}[f[S]]\subseteq f^{-1}[T]</math>. Conversely, If for an <math>x\in S</math> we also have <math>x\in f^{-1}[T]</math>, then clearly <math> f(x)\in T </math>. So <math> S \subseteq f^{-1}[T] </math> implies <math> f[S] \subseteq T </math>. We concude that left adjoint to the inverse image functor <math>f^{*}</math> is given by the direct image. Here is a characterization of this result, which matches more the logical interpretation: The image of <math>S</math> under <math>\exists_f </math> is the full set of <math>y</math>'s, such that <math> f^{-1} [\{y\}] \cap S</math> is non-empty. This works because it neglects exactly those <math>y\in Y</math> which are in the complement of <math>f[S]</math>. So
::<math>
\exists_f S
= \{ y \in Y \mid \exists (x \in f^{-1}[\{y\}]).\, x \in S \; \}
= f[S].
</math>
:Put this in analogy to our motivation <math>\{y\in Y\mid\exists x.\,\psi_f(x,y)\land\phi_{S}(x)\}</math>.
:The right adjoint to the inverse image functor is given (without doing the computation here) by
::<math>
\forall_f S
= \{ y \in Y \mid \forall (x \in f^{-1} [\{y\}]).\, x \in S \; \}.
</math>
: The subset <math>\forall_f S</math> of <math>Y</math> is characterized as the full set of <math>y</math>'s with the property that the inverse image of <math>\{y\}</math> with respect to <math>f</math> is fully contained within <math>S</math>. Note how the predicate determining the set is the same as above, except that <math>\exists</math> is replaced by <math>\forall</math>.

:''See also [[powerset]].''

== Properties ==

===Existence===

{{anchor|Freyd's adjoint functor theorem}}Not every functor ''G'' : ''C'' → ''D'' admits a left adjoint. If ''C'' is a [[complete category]], then the functors with left adjoints can be characterized by the '''adjoint functor theorem''' of [[Peter J. Freyd]]: ''G'' has a left adjoint if and only if it is [[limit (category theory)#Preservation of limits|continuous]] and a certain smallness condition is satisfied: for every object ''Y'' of ''D'' there exists a family of morphisms

:''f''''i'' : ''Y'' → ''G''(''Xi'')

where the indices ''i'' come from a ''set'' ''I'', not a ''[[class (set theory)|proper class]]'', such that every morphism

:''h'' : ''Y'' → ''G''(''X'')

can be written as

:''h'' = ''G''(''t'') o ''f''''i''

for some ''i'' in ''I'' and some morphism

:''t'' : ''X''''i'' → ''X'' in ''C''.

An analogous statement characterizes those functors with a right adjoint.

===Uniqueness===

If the functor ''F'' : ''C'' ← ''D'' has two right adjoints ''G'' and ''G''′, then ''G'' and ''G''′ are [[natural transformation|naturally isomorphic]]. The same is true for left adjoints.

Conversely, if ''F'' is left adjoint to ''G'', and ''G'' is naturally isomorphic to ''G''′ then ''F'' is also left adjoint to ''G''′. More generally, if 〈''F'', ''G'', ε, η〉 is an adjunction (with counit-unit (ε,η)) and
:σ : ''F'' → ''F''′
:τ : ''G'' → ''G''′
are natural isomorphisms then 〈''F''′, ''G''′, ε′, η′〉 is an adjunction where
:<math>\begin{align}
\eta' &= (\tau\ast\sigma)\circ\eta \\
\varepsilon' &= \varepsilon\circ(\sigma^{-1}\ast\tau^{-1}).
\end{align}</math>
Here <math>\circ</math> denotes vertical composition of natural transformations, and <math>\ast</math> denotes horizontal composition.

===Composition===

Adjunctions can be composed in a natural fashion. Specifically, if 〈''F'', ''G'', ε, η〉 is an adjunction between ''C'' and ''D'' and 〈''F''′, ''G''′, ε′, η′〉 is an adjunction between ''D'' and ''E'' then the functor
:<math>F' \circ F : \mathcal{C} \leftarrow \mathcal{E}</math>
is left adjoint to
:<math>G \circ G' : \mathcal{C} \to \mathcal{E}.</math>
More precisely, there is an adjunction between ''F''′ ''F'' and ''G'' ''G''′ with unit and counit given by the compositions:
:<math>\begin{align}
&1_{\mathcal E} \xrightarrow{\eta} G F \xrightarrow{G \eta' F} G G' F' F \\
&F' F G G' \xrightarrow{F' \varepsilon G'} F' G' \xrightarrow{\varepsilon'} 1_{\mathcal C}.
\end{align}</math>
This new adjunction is called the '''composition''' of the two given adjunctions.

One can then form a category whose objects are all [[small category|small categories]] and whose morphisms are adjunctions.

===Limit preservation===

The most important property of adjoints is their continuity: every functor that has a left adjoint (and therefore ''is'' a right adjoint) is ''continuous'' (i.e. commutes with [[limit (category theory)|limits]] in the category theoretical sense); every functor that has a right adjoint (and therefore ''is'' a left adjoint) is ''cocontinuous'' (i.e. commutes with [[limit (category theory)|colimits]]).

Since many common constructions in mathematics are limits or colimits, this provides a wealth of information. For example:
* applying a right adjoint functor to a [[product (category theory)|product]] of objects yields the product of the images;
* applying a left adjoint functor to a [[coproduct]] of objects yields the coproduct of the images;
* every right adjoint functor is [[left exact functor|left exact]];
* every left adjoint functor is [[right exact functor|right exact]].

===Additivity===

If ''C'' and ''D'' are [[preadditive categories]] and ''F'' : ''C'' ← ''D'' is an [[additive functor]] with a right adjoint ''G'' : ''C'' → ''D'', then ''G'' is also an additive functor and the hom-set bijections
:<math>\Phi_{Y,X} : \mathrm{hom}_{\mathcal C}(FY,X) \cong \mathrm{hom}_{\mathcal D}(Y,GX)</math>
are, in fact, isomorphisms of abelian groups. Dually, if ''G'' is additive with a left adjoint ''F'', then ''F'' is also additive.

Moreover, if both ''C'' and ''D'' are [[additive categories]] (i.e. preadditive categories with all finite [[biproduct]]s), then any pair of adjoint functors between them are automatically additive.

==Relationships==

===Universal constructions===

As stated earlier, an adjunction between categories ''C'' and ''D'' gives rise to a family of [[universal morphism]]s, one for each object in ''C'' and one for each object in ''D''. Conversely, if there exists a universal morphism to a functor ''G'' : ''C'' → ''D'' from every object of ''D'', then ''G'' has a left adjoint.

However, universal constructions are more general than adjoint functors: a universal construction is like an optimization problem; it gives rise to an adjoint pair if and only if this problem has a solution for every object of ''D'' (equivalently, every object of ''C'').

===Equivalences of categories===

If a functor ''F'': ''C''←''D'' is one half of an [[equivalence of categories]] then it is the left adjoint in an adjoint equivalence of categories, i.e. an adjunction whose unit and counit are isomorphisms.

Every adjunction 〈''F'', ''G'', ε, η〉 extends an equivalence of certain subcategories. Define ''C''1 as the full subcategory of ''C'' consisting of those objects ''X'' of ''C'' for which ε''X'' is an isomorphism, and define ''D''1 as the [[full subcategory]] of ''D'' consisting of those objects ''Y'' of ''D'' for which η''Y'' is an isomorphism. Then ''F'' and ''G'' can be restricted to ''D''1 and ''C''1 and yield inverse equivalences of these subcategories.

In a sense, then, adjoints are "generalized" inverses. Note however that a right inverse of ''F'' (i.e. a functor ''G'' such that ''FG'' is naturally isomorphic to 1''D'') need not be a right (or left) adjoint of ''F''. Adjoints generalize ''two-sided'' inverses.

===Monads===

Every adjunction 〈''F'', ''G'', ε, η〉 gives rise to an associated [[monad (category theory)|monad]] 〈''T'', η, μ〉 in the category ''D''. The functor
:<math>T : \mathcal{D} \to \mathcal{D}</math>
is given by ''T'' = ''GF''. The unit of the monad
:<math>\eta : 1_{\mathcal{D}} \to T</math>
is just the unit η of the adjunction and the multiplication transformation
:<math>\mu : T^2 \to T\,</math>
is given by μ = ''G''ε''F''. Dually, the triple 〈''FG'', ε, ''F''η''G''〉 defines a [[comonad]] in ''C''.

Every monad arises from some adjunction—in fact, typically from many adjunctions—in the above fashion. Two constructions, called the category of [[Eilenberg–Moore algebra]]s and the [[Kleisli category]] are two extremal solutions to the problem of constructing an adjunction that gives rise to a given monad.

==References==
<references />
*{{cite book | last1 = Adámek | first1 = Jiří | first2=Horst | last2=Herrlich | first3=George E. | last3=Strecker | year = 1990 | url = http://katmat.math.uni-bremen.de/acc/acc.pdf | title = Abstract and Concrete Categories. The joy of cats | publisher = John Wiley & Sons | isbn = 0-471-60922-6 | zbl=0695.18001 }}
*{{cite book | first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = [[Graduate Texts in Mathematics]] | volume=5 | edition = 2nd | publisher = Springer-Verlag | isbn = 0-387-98403-8 | zbl=0906.18001 }}

==External links==
*[http://www.youtube.com/view_play_list?p=54B49729E5102248 Adjunctions] Seven short lectures on adjunctions.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
{{Functors}}

{{DEFAULTSORT:Adjoint Functors}}
[[Category:Adjoint functors| ]]

Wolfram SystemModeler

2014-05-07T12:12:08Z

Magmalex: added link to Modelica page

{{Infobox Software
| name = Wolfram SystemModeler
| logo = [[Image:WolframSystemModelerLogo.png|64px|Wolfram SystemModeler logo]]
| developer = [[Wolfram Research]] |
| latest_release_version = 3.0 |
| latest_release_date = 23 May 2012|
| operating_system = [[Microsoft Windows|Windows]] and [[Mac OS X|OS X]] |
| genre = [[Object-oriented programming]] |
| license = [[Proprietary software|Proprietary]] |
| website = [http://www.wolfram.com/system-modeler Wolfram SystemModeler] |
}}
'''Wolfram SystemModeler''', developed by Wolfram MathCore, is a platform for engineering as well as life science modeling and simulation based on the [[Modelica]] language. It provides an interactive graphical modeling and simulation environment and a customizable set of component libraries.

== Features ==
Features of ''Wolfram SystemModeler'' include{{Citation needed|date=May 2012}}:
* Based on the non-proprietary, object-oriented, equation based, [[Modelica|Modelica language]].
* Graphical user interface for drag-and-drop modeling
* Textual user interface for equation-based Modelica modeling, simulation, documentation, and analysis
* A-causal (component based) and causal (block based) modeling
* Multidomain modeling, including:
** 1D and 3D mechanics
** Electrics
** Hydraulics
** Thermodynamics
** Controls engineering
** Systems biology
* Integration with ''Mathematica'' for analysis and documentation of models and simulations

== Interface ==
''Wolfram SystemModeler's'' primary interface, Model Center, is an interactive graphical environment including a customizable set of component libraries. Models developed in Model Center can be simulated in Simulation Center. The software also provides a tight integration with the [[Mathematica]] environment. This allows users that have both software to develop, simulate, document, and analyze their ''Wolfram SystemModeler'' models within ''Mathematica'' notebooks. The software is used within the engineering field as well as within life science.

== Editions ==
Originally developed by MathCore Engineering as ''MathModelica'' it was acquired by [[Wolfram Research]] on March 30, 2011<ref>{{cite web|last=Wolfram|first=Stephen|title=Launching a New Era in Large-Scale Systems Modeling|url=http://blog.wolfram.com/2011/03/30/launching-a-new-era-in-large-scale-systems-modeling/}}</ref> and re-released as ''Wolfram SystemModeler'' on 23 May 2012 <ref>{{cite web|last=Wolfram|first=Stephen|title=Launching a New Era in Large-Scale Systems Modeling|url=http://blog.wolfram.com/2011/03/30/launching-a-new-era-in-large-scale-systems-modeling/}}</ref><ref>{{cite web|title=Model the World with Wolfram’s New Software for Engineers|url=http://venturebeat.com/2012/05/23/model-the-world-with-wolframs-new-software-for-engineers/}}</ref><ref>{{cite web|url=http://www.pcworld.com/businesscenter/article/256093/wolfram_expands_into_system_modeling.html|title=Wolfram Expands into System Modeling}}</ref><ref>{{cite web|url=http://www.geeky-gadgets.com/wolfram-new-systemmodeler-software-helps-engineers-create-complex-simulations-24-05-2012/|title=Wolfram New SystemModeler Software Helps Engineers Create Complex Simulations}}</ref><ref>{{cite web|url=http://www.webpronews.com/wolfram-systemmodeler-takes-on-large-scale-system-modeling-2012-05|title=Wolfram SystemModeler Takes On Large Scale System Modeling}}</ref> with improved integration with [[Wolfram Research]]'s [[Mathematica]] software.

== Modeling Language ==
''Wolfram SystemModeler'' uses the free object-oriented modeling language [[Modelica]]. Modelica is a non-proprietary, object-oriented, equation based language to conveniently model complex physical systems containing, e.g., mechanical, electrical, electronic, hydraulic, thermal, control, electric power or process-oriented subcomponents.

== Release history ==

{| class="wikitable" style="font-size: 90%; text-align: left; "
|-
! Name/Version !! Date
|-
| SystemModeler 1.0
| March 2011
|-
| SystemModeler 3.0
| May 2012
|}

==See also==
* [[AMESim]]
* [[APMonitor]]
* [[Modelica]]
* [[Mathematica]]
* [[Scientific modelling|Modelling]]
* [[Simulation]]
* [[Computer simulation]]
* [[Dymola]]
* [[SimulationX]]

== Licensing ==
Wolfram SystemModeler is [[proprietary software]] protected by both [[trade secret]] and [[copyright]] law.

== References ==
{{reflist|2}}

==External links==
*[http://www.mathcore.com Wolfram MathCore, original developer of MathModelica]
*[http://www.wolfram.com Wolfram Research, developer of Mathematica]

[[Category:Object-oriented programming]]
[[Category:Wolfram Research]]

{{software-stub}}

Invariant (mathematics)

2014-04-26T16:34:03Z

Magmalex: /* Invariant set */ Specified mapping T

In [[mathematics]], an '''invariant''' is a property of a class of mathematical objects that remains unchanged when [[Transformation (function)|transformations]] of a certain type are applied to the objects. The particular class of objects and type of transformations are usually indicated by the context in which the term is used. For example, the area of a triangle is an invariant with respect to [[isometry|isometries]] of the Euclidean plane. The phrases "invariant under" and "invariant to" a transformation are both used. More generally, an invariant with respect to an [[equivalence relation]] is a property that is constant on each equivalence class.

Invariants are used in diverse areas of mathematics such as [[geometry]], [[topology]] and [[algebra]]. Some important classes of transformations are defined by an invariant they leave unchanged, for example [[conformal map]]s are defined as transformations of the plane that preserve angles. The discovery of invariants is an important step in the process of classifying mathematical objects.

== Simple examples ==
The most fundamental example of invariance is expressed in our ability to count. For a finite collection of objects of any kind, there appears to be a number to which we invariably arrive, regardless of how we count the objects in the set. The quantity—a [[cardinal number]]—is associated with the set, and is invariant under the process of counting.

An [[List of mathematical identities|identity]] is an equation that remains true for all values of its variables. There are also [[List of inequalities|inequalities]] that remain true when the values of their variables change.

Another simple example of invariance is that the [[distance]] between two points on a [[number line]] is not changed by [[addition|adding]] the same quantity to both numbers. On the other hand, [[multiplication]] does not have this property, so distance is not invariant under multiplication.

[[Angle]]s and [[ratio]]s of distances are invariant under [[Scaling (geometry)|scalings]], [[Rotation (mathematics)|rotation]]s, [[Translation (geometry)|translation]]s and [[Reflection (mathematics)|reflection]]s. These transformations produce [[Similarity (geometry)|similar]] shapes, which is the basis of [[trigonometry]]. All circles are similar. Therefore they can be transformed into each other and the ratio of the [[circumference]] to the [[diameter]] is invariant and equal to [[pi]].

== More advanced examples ==
Some more complicated examples:
* The [[real part]] and the [[absolute value]] of a [[complex number]] are invariant under [[complex conjugation]].
* The degree of a polynomial is invariant under linear change of variables.
* The dimension and homology groups of a topological object are invariant under [[homeomorphism]].<ref>{{harvtxt|Fraleigh|1976|pp=166–167}}</ref>
* The number of [[fixed point (mathematics)|fixed points]] of a [[dynamical system]] is invariant under many mathematical operations.
* Euclidean distance is invariant under [[orthogonal matrix|orthogonal transformations]].
* Euclidean [[area]] is invariant under a [[linear map]] with [[determinant]] 1 (see [[2 × 2 real matrices#Equi-areal mapping|Equi-areal maps]]).
* Some invariants of [[projective transformation]]s: [[collinearity]] of three or more points, [[concurrent lines|concurrency]] of three or more lines, [[conic section]]s, the [[cross-ratio]].<ref>{{harvtxt|Kay|1969|pp=219}}</ref>
* The [[determinant]], [[Trace (linear algebra)|trace]], and [[eigenvectors]] and [[eigenvalues]] of a square matrix are invariant under changes of basis. In a word, the [[spectrum of a matrix]] is invariant to the change of basis.
* [[Invariants of tensors]].
* The [[singular-value decomposition|singular values]] of a matrix are invariant under orthogonal transformations.
* [[Lebesgue measure]] is invariant under translations.
* The [[variance]] of a [[probability distribution]] is invariant under translations of the [[real number|real]] line; hence the variance of a [[random variable]] is unchanged by the addition of a constant to it.
* The [[fixed point (mathematics)|fixed points]] of a transformation are the elements in the domain invariant under the transformation. They may, depending on the application, be called [[symmetry|symmetric]] with respect to that transformation. For example, objects with [[translational symmetry]] are invariant under certain translations.
*The integral <math>\textstyle{\int_M K\,d\mu}</math> of the Gaussian curvature ''K'' of a 2-dimensional Riemannian manifold (''M'',''g'') is invariant under changes of the [[Riemannian metric]] ''g''. This is the [[Gauss-Bonnet Theorem]].

==Invariant set==
A subset ''S'' of the domain ''U'' of a mapping ''T'': ''U'' → ''U'' is an '''invariant set''' under the mapping when <math>x \in S \Rightarrow T(x) \in S.</math> Note that the [[element (mathematics)|elements]] of ''S'' are not [[Fixed point (mathematics)|fixed]], but rather the set ''S'' is fixed in the [[power set]] of ''U''.
For example, a [[circle]] is an invariant subset of the plane under a [[rotation]] about the circle’s center. Further, a [[conical surface]] is invariant as a set under a [[homothety]] of space.

An invariant set of an operation ''T'' is also said to be '''stable under''' ''T''. For example, the [[normal subgroup]]s that are so important in [[group theory]] are those [[subgroup]]s that are stable under the [[inner automorphism]]s of the ambient group.<ref>{{harvtxt|Fraleigh|1976|p=103}}</ref><ref>{{harvtxt|Herstein|1964|p=42}}</ref><ref>{{harvtxt|McCoy|1968|p=183}}</ref>
Other examples occur in [[linear algebra]]. Suppose a [[linear transformation]] ''T'' has an [[eigenvector]] '''v'''. Then the line through 0 and '''v''' is an invariant set under ''T''. The eigenvectors span an [[invariant subspace]] which is stable under ''T''.

When ''T'' is a [[screw displacement]], the [[screw axis]] is an invariant line, though if the [[pitch (screw)|pitch]] is non-zero, ''T'' has no fixed points.

== Formal statement ==
{{unreferenced section|date=February 2010}}
The notion of invariance is formalized in three different ways in mathematics: via [[group action]]s, presentations, and deformation.

=== Unchanged under group action ===
Firstly, if one has a group ''G'' acting on a mathematical object (or set of objects) ''X,'' then one may ask which points ''x'' are unchanged, "invariant" under the group action, or under an element ''g'' of the group.

Very frequently one will have a group acting on a set ''X'' and ask which objects in an ''associated'' set ''F''(''X'') are invariant. For example, rotation in the plane about a point leaves the point about which it rotates invariant, while translation in the plane does not leave any points invariant, but does leave all lines parallel to the direction of translation invariant as lines. Formally, define the set of lines in the plane ''P'' as ''L''(''P''); then a rigid motion of the plane takes lines to lines – the group of rigid motions acts on the set of lines – and one may ask which lines are unchanged by an action.

More importantly, one may define a ''function'' on a set, such as "radius of a circle in the plane" and then ask if this function is invariant under a group action, such as rigid motions.

Dual to the notion of invariants are ''[[coinvariant]]s,'' also known as ''orbits,'' which formalizes the notion of [[congruence relation|congruence]]: objects which can be taken to each other by a group action. For example, under the group of rigid motions of the plane, the perimeter of a triangle is an invariant, while the set of triangles congruent to a given triangle is a coinvariant.

These are connected as follows: invariants are constant on coinvariants (for example, congruent triangles have the same perimeter), while two objects which agree in the value of one invariant may or may not be congruent (two triangles with the same perimeter need not be congruent). In [[classification problem (mathematics)|classification problem]]s, one seeks to find a [[complete set of invariants]], such that if two objects have the same values for this set of invariants, they are congruent. For example, triangles such that all three sides are equal are congruent, via SSS congruence, and thus the length of all three sides forms a complete set of invariants for triangles.

=== Independent of presentation ===
Secondly, a function may be defined in terms of some presentation or decomposition of a mathematical object; for instance, the [[Euler characteristic]] of a [[cell complex]] is defined as the alternating sum of the number of cells in each dimension. One may forget the cell complex structure and look only at the underlying topological space (the manifold) – as different cell complexes give the same underlying manifold, one may ask if the function is ''independent'' of choice of ''presentation,'' in which case it is an ''intrinsically'' defined invariant. This is the case for the Euler characteristic, and a general method for defining and computing invariants is to define them for a given presentation and then show that they are independent of the choice of presentation. Note that there is no notion of a group action in this sense.

The most common examples are:
* The [[Differentiable manifold#Definition|presentation of a manifold]] in terms of coordinate charts – invariants must be unchanged under [[change of coordinates]].
* Various [[manifold decomposition]]s, as discussed for Euler characteristic.
* Invariants of a [[presentation of a group]].

=== Unchanged under perturbation ===
Thirdly, if one is studying an object which varies in a family, as is common in [[algebraic geometry]] and [[differential geometry]], one may ask if the property is unchanged under perturbation – if an object is constant on families or invariant under change of metric, for instance.

==See also==
* [[Erlangen program]]
* [[Invariant (physics)]]
* [[Invariant estimator]] in statistics
* [[Invariant theory]]
* [[Symmetry in mathematics]]
* [[Topological invariant]]
* [[Invariant differential operator]]
* [[Invariant measure]]
* [[Mathematical constant]]
* [[Mathematical constants and functions]]

==Notes==

{{Reflist}}

==References==
{{Refbegin}}
* {{ citation | first1 = John B. | last1 = Fraleigh | year = 1976 | isbn = 0-201-01984-1 | title = A First Course In Abstract Algebra | edition = 2nd | publisher = [[Addison-Wesley]] | location = Reading }}
* {{ citation | first1 = I. N. | last1 = Herstein | year = 1964 | isbn = 978-1114541016 | title = Topics In Algebra | publisher = [[Blaisdell Publishing Company]] | location = Waltham }}
* {{ citation | first1 = David C. | last1 = Kay | year = 1969 | lccn = 69-12075 | title = College Geometry | publisher = [[Holt, Rinehart and Winston]] | location = New York }}
* {{ citation | first1 = Neal H. | last1 = McCoy | year = 1968 | title = Introduction To Modern Algebra, Revised Edition | publisher = [[Allyn and Bacon]] | location = Boston | lccn = 68-15225 }}
*{{MathWorld|title=Invariant|urlname=Invariant}}
*{{springer|title=Invariant|id=I/i052200|last=Popov|first=V.L.|authorlink=Vladimir L. Popov}}
{{Refend}}

==External links==
* [[Planet Math]] [http://planetmath.org/encyclopedia/Invariant.html Invariant]

{{DEFAULTSORT:Invariant (Mathematics)}}
[[Category:Mathematical terminology]]

Aristotelian physics

2013-10-03T14:20:32Z

Magmalex: /* Motion */

{{Use dmy dates|date=June 2013}}
{{lead rewrite|date=February 2012}}
'''Aristotelian Physics''', the [[natural sciences]], are described in the works of the [[Ancient Greek philosophy|Greek philosopher]] [[Aristotle]] (384 BC – 322 BC). In the ''[[Physics (Aristotle)|Physics]]'', Aristotle established general principles of change that govern all natural bodies; both living and inanimate, celestial and terrestrial—including all motion, change in respect to place, change in respect to size or number, qualitative change of any kind, and coming to be and passing away. As Martin Heidegger, one of the foremost philosophers of the twentieth century, once wrote,

{{quote|Aristotelian "physics" is different from what we mean today by this word, not only to the extent that it belongs to antiquity whereas the modern physical sciences belong to modernity, rather above all it is different by virtue of the fact that Aristotle's "physics" is philosophy, whereas modern physics is a positive science that presupposes a philosophy.... This book determines the warp and woof of the whole of Western thinking, even at that place where it, as modern thinking, appears to think at odds with ancient thinking. But opposition is invariably {{sic|hide=y|comprised |of}} a decisive, and often even perilous, dependence. Without Aristotle's ''Physics'' there would have been no Galileo.<ref>Martin Heidegger, ''The Principle of Reason'', trans. Reginald Lilly, (Indiana University Press, 1991), 62-[http://books.google.com/books?id=rWDUmlA6M98C&lpg=PP1&pg=PA63#v=onepage&q&f=false 63].</ref>}}

To Aristotle, physics was a broad term that includes all nature sciences, such as philosophy of mind, body, sensory experience, memory and biology, and constitutes the foundational thinking underlying [[Corpus_Aristotelicum#Physics_.28the_study_of_nature.29|many]] of his works.

== Ancient concepts ==
Some concepts involved in Aristotle's physics are:
#'''[[Teleology]]''': Aristotle observes that natural things tend toward definite goals or ends insofar as they are natural. For example, a seed, under normal circumstances, has the goal (telos) of becoming an adult plant. <ref>{{cite book|last=Aristotle|title=Parts of Animals I.1}}</ref> Regularities manifest a rudimentary kind of teleology.
# '''Natural motion''': Terrestrial objects tend toward a different part of the universe according to their composition of the four elements. For example, earth, the heaviest element, tends toward the center of the universe—hence the reason for the Earth being at the center. At the opposite extreme the lightest element, fire, tends upward, away from the center. The relative proportion of the four elements composing an object determines its motion. The elements are not proper ''[[Substance theory|substances]]'' in Aristotelian theory or the modern sense of the word. Refining an arbitrarily pure sample of an element isn't possible; They were [[abstraction]]s; one might consider an arbitrarily pure sample of a terrestrial substance having a large [[ratio]] of one element relative to the others.
# '''Terrestrial motion''': Terrestrial objects move [[gravity|downward]] or [[buoyancy|upward]] toward their natural place. Motion from side to side results from the turbulent collision and sliding of the objects as well as transformations between the elements, (generation and corruption).
# '''Rectilinear motion''': Ideal terrestrial motion would proceed straight up or straight down at [[Newton's laws of motion#Newton's first law|constant speed]]. Celestial motion is always ideal, it is circular and its speed is constant.
# '''Speed, weight and resistance''': The ideal speed of a terrestrial object is [[directly proportional]] to its weight. In nature, however, the matter obstructing an object's path is a limiting factor that's [[inversely proportional]] to the [[viscosity]] of the medium.
# '''Vacuum isn't possible''': [[Vacuum]] doesn't occur, but hypothetically, terrestrial motion in a vacuum would be indefinitely fast.

# '''Continuum''': Aristotle argues against the ''indivisibles'' of [[Democritus]] (which differ considerably from the [[Corpuscularianism|historical]] and the [[Atomic theory|modern]] use of the term ''[[atom]]'').
# '''Aether''': The "greater and lesser lights of heaven", (the sun, moon, planets and stars), are embedded in perfectly concentric [[celestial spheres|''crystal'' spheres]] that rotate eternally at fixed rates. Because the spheres never change and ([[meteorite]]s notwithstanding) don't fall down or rise up from the ground, they cannot be composed of the four terrestrial elements. Much as [[Homer]]'s ''[[aether (mythology)|æthere (αἰθήρ)]]'', the "pure air" of [[Mount Olympus]] was the divine counterpart of the air (άήρ, ''aer'') breathed by [[immortality|mortal]]s, the celestial spheres are composed of a special element, eternal and unchanging, with circular natural motion.
# '''Terrestrial change''': {{anchor|celestial unchanging}}[[File:Four elements representation.svg|thumb|right|The four terrestrial elements]] Unlike the eternal and unchanging celestial ''[[Aether (classical element)|aether]]'', each of the four terrestrial elements are capable of changing into either of the two elements they share a property with: e.g. the cold and wet (''[[Water (classical element)|water]]'') can transform into the hot and wet (''[[Air (classical element)|air]]'') or the cold and dry (''[[Earth (classical element)|earth]]'') and any apparent change into the hot and dry (''[[Fire (classical element)|fire]]'') is actually a [[Gray code|two step]] process. These properties are predicated of an actual substance relative to the work it's able to do; that of heating or chilling and of desiccating or moistening. The four elements exist ''only'' with regard to this capacity and relative to some potential work. The celestial element is eternal and unchanging, so only the four terrestrial elements account for ''coming to be'' and ''passing away''; also called ''"generation and corruption"'' after the Latin title of Aristotle's [[On Generation and Corruption|''De Generatione et Corruptione'' (Περὶ γενέσεως καὶ φθορᾶς)]].
# '''Celestial motion''': The ''crystal spheres'' carrying the sun, moon and stars move eternally with unchanging circular motion. They're composed of solid ''[[Aether (classical element)|aether]]'' and no gaps exist between the spheres. Spheres are embedded within spheres to account for the ''wandering stars'', (i.e. the modern [[planet]]s, which appear to move erratically in comparison to the sun, moon and stars). Later, the belief that all spheres are concentric was forsaken in favor of [[Ptolemy]]'s ''[[deferent and epicycle]]''. Aristotle submits to the calculations of [[astronomer]]s regarding the total number of spheres and various accounts give a number in the neighborhood of 50 spheres. An ''[[unmoved mover]]'' is assumed for each sphere, including a ''[[primum movens|prime mover]]'' for the ''sphere of [[fixed stars]]''. The ''unmoved movers'' do not push the spheres (nor could they, they're insubstantial and dimensionless); rather, they're the [[four causes|final cause]] of the motion, meaning they explain it in a way that's similar to the explanation "the soul is moved by beauty". They simply "think about thinking", eternally without change, which is the ''[[Hylomorphism|idea]]'' of [[Metaphysics (Aristotle)|"being ''qua'' being"]] in Aristotle reformulation of [[Theory of Forms|Plato's theory]].

While consistent with common human experience, Aristotle's principles were not based on controlled, quantitative experiments, so, while they account for many broad features of nature, they do not describe our universe in the precise, quantitative way we have more recently come to expect from science. Contemporaries of Aristotle like [[Aristarchus of Samos|Aristarchus]] rejected these principles in favor of [[heliocentrism]], but their ideas were not widely accepted. Aristotle's principles were difficult to disprove merely through casual everyday observation, but later development of the [[scientific method]] challenged his views with [[experiment]]s, careful measurement, and more advanced technology such as the [[telescope]] and [[vacuum pump]].

=== Elements ===
According to Aristotle, the [[classical element|elements]] which compose the terrestrial spheres are different from the one that composes the celestial spheres.<ref name="utk">{{Cite web|url=http://csep10.phys.utk.edu/astr161/lect/history/aristotle_dynamics.html|title=Physics of Aristotle vs. The Physics of Galileo|accessdate=6 April 2009| archiveurl= http://web.archive.org/web/20090411072007/http://csep10.phys.utk.edu/astr161/lect/history/aristotle_dynamics.html| archivedate= 11 April 2009 | deadurl= no}}</ref> He believed that four elements make up everything under the moon (the terrestrial): [[earth (classical element)|earth]], [[air (classical element)|air]], [[fire (classical element)|fire]] and [[water (classical element)|water]].{{Ref_label|A|a|none}}<ref name="edu">{{Cite web|url=http://www.hep.fsu.edu/~wahl/Quarknet/pepperlect/aristogalnewt.pdf|format=PDF|title=www.hep.fsu.edu|accessdate=26 March 2007}}</ref>
He also held that the heavens are made of a special, fifth element called "[[Aether (classical element)|aether]]",<ref name="edu" /> which is weightless and "incorruptible" (which is to say, it doesn't change).<ref name="edu" /> Aether is also known by the name "quintessence"—literally, "fifth substance".<ref name="ari">{{Cite web|url=http://aether.lbl.gov/www/classes/p10/aristotle-physics.html|title=Aristotle's physics|accessdate=6 April 2009}}</ref>
[[File:Aristotle Physica page 1.png|thumb|left|200px|Page from an 1837 edition of ''Physica'' by the ancient Greek philosopher [[Aristotle]]—a book about a variety of subjects including the [[philosophy]] of nature and some topics within [[physics]]]]

He considered heavy substances such as [[iron]] and other metals to consist primarily of the element ''earth'', with a smaller amount of the other three terrestrial elements. Other, lighter objects, he believed, have less earth, relative to the other three elements in their composition.<ref name="ari" />



=== Motion ===

Aristotle held that each of the four terrestrial (or ''worldly'') elements move toward their ''natural place'', and that this natural motion would proceed unless hindered. For instance, because [[smoke]] is mainly ''air'', it rises toward the sky but not as high as ''fire''.

Motion and change are closely related in Aristotelian physics. Motion, according to [[Aristotle]] involve change from [[potentiality]] to [[actuality]].<ref>{{Bodnar, Istvan, "Aristotle's Natural Philosophy", The Stanford Encyclopedia of Philosophy (Spring 2012 Edition), Edward N. Zalta (ed.), URL = <http://plato.stanford.edu/archives/spr2012/entries/aristotle-natphil/>.}}</ref> He gave example of four types of change.

Aristotle proposed that velocity is directly proportional to the weight of an object which will make objects of different weight fall to the ground at different time. [[Galileo]] would later contest Aristotle’s point by demonstrating that object of different weights reach the ground in a similar time.<ref>Lindberg, D. (2008) ''The beginnings of western science: The European scientific tradition in philosophical, religious, and institutional context, prehistory to a.d. 1450'' (2nd ed.) [[University of Chicago Press]]</ref>

===Vacuum===

A vacuum, or void, is a place free of everything, and Aristotle argued against the possibility. Aristotle believed that the speed of an object's motion is proportional to the force being applied (or the object's weight in the case of natural motion) and inversely proportional to the viscosity of the medium; the more tenuous a medium is, the faster the motion. He reasoned that objects moving in a void, could move indefinitely fast and thus, the objects surrounding a void would immediately fill it before it could actually form.<ref>Land, Helen ''The Order of Nature in Aristotle's Physics: Place and the Elements'' (1998)</ref> In astronomy, [[Void (astronomy)|voids]], such as the [[Local Void]] adjacent to our galaxy, have the opposite effect; off-center bodies are ejected from the void due to the gravity of the material outside, which being the farthest away in a direction towards the center, is also at its weakest.<ref>{{cite journal |author=Tully |author2=Shaya |author3=Karachentsev |author4=Courtois |author5=Kocevski |author6=Rizzi |author7=Peel |year=2008 |title=Our Peculiar Motion Away From the Local Void |journal=The Astrophysical Journal |volume=676 |number=1 |url=http://stacks.iop.org/0004-637X/676/i=1/a=184 |pages=184|bibcode = 2008ApJ...676..184T |doi = 10.1086/527428 |arxiv = 0705.4139 }}</ref>

=== Natural place ===

The Aristotelian explanation of gravity is that all bodies move toward their ''natural place''. For the element ''earth'', that place is the center of the ([[geocentric]]) universe, next comes the natural place of ''water'' (in a concentric shell around that of ''earth''). The natural place of ''air'' is likewise a concentric shell surrounding the place of ''water''. [[Sea level]] is between those two. Finally, the natural place of ''fire'' is higher than that of ''air'' but below the innermost celestial sphere, (the one carrying the Moon). Even at locations well above sea level, such as a mountain top, an object made mostly of the former two elements tends to fall and objects made mostly of the latter two tend to rise.

=== Place (topos) ===

In Book ''Delta'' of his ''Physics'' (IV.5) Aristotle defines topos (place) as the inner two-dimensional surface boundary of the containing body that is in touch with the outer two-dimensional surface of the contained body. This definition was the most dominant until the beginnings of the 17th century, even though it was called into doubt and debated by philosophers since antiquity, as for instance discussed by [[Simplicius]] in his ''Corollaries on place''. The most significant early serious critique to this conception of place was demonstrated geometrically by the 11th century Arab polymath al-Hasan [[Ibn al-Haytham]] ([[Alhazen]]) in his ''Discourse on place''.<ref>{{cite journal|last=El-Bizri|first=Nader|title=In Defence of the Sovereignty of Philosophy: al-Baghdadi's Critique of Ibn al-Haytham's Geometrisation of Place|journal=Arabic Sciences and Philosophy (Cambridge University Press)|year=2007|volume=17|pages=57–80}}</ref>

==Medieval commentary==
{{Cite check|section|date=September 2010}}
{{Main|Theory of impetus}}

The Aristotelian theory of motion came under criticism and/or modification during the [[Middle Ages]]. The first such modification came from [[John Philoponus]] in the 6th century. He partly accepted Aristotle's theory that "continuation of motion depends on continued action of a force," but modified it to include his idea that the hurled body acquires a motive power or inclination for forced movement from the agent producing the initial motion and that this power secures the continuation of such motion. However, he argued that this impressed virtue was temporary; that it was a self-expending inclination, and thus the violent motion produced comes to an end, changing back into natural motion. In the 11th century, the Persian polymath [[Avicenna]], in ''[[The Book of Healing]]'' (1027) was influenced by Philoponus' theory in its rough outline, but took it much further to present the first alternative to the Aristotelian theory. In the [[Avicennism|Avicennan]] theory of motion, the violent inclination he conceived was non-self-consuming, a permanent force whose effect was dissipated only as a result of external agents such as air resistance, making him "the first to conceive such a permanent type of impressed virtue for non-natural motion." Such a self-motion (''mayl'') is "almost the opposite of the Aristotelian conception of violent motion of the projectile type, and it is rather reminiscent of the principle of [[inertia]], i.e., [[Newton's first law of motion]]."<ref>[[Aydin Sayili]] (1987), "Ibn Sīnā and Buridan on the Motion of the Projectile", ''Annals of the New York Academy of Sciences'' '''500''' (1): 477–482 [477]: {{quote|According to Aristotle, continuation of motion depends on continued action of a force. The motion of a hurled body, therefore, requires elucidation. Aristotle maintained that the air of the atmosphere was responsible for the continuation of such motion. John Philoponos of the 6th century rejected this Aristotelian view. He claimed that the hurled body acquires a motive power or an inclination for forced movement from the agent producing the initial motion and that this power or condition and not the ambient medium secures the continuation of such motion. According to Philoponos this impressed virtue was temporary. It was a self-expending inclination, and thus the violent motion thus produced comes to an end and changes into natural motion. Ibn Sina adopted this idea in its rough outline, but the violent inclination as he conceived it was a non-self-consuming one. It was a permanent force whose effect got dissipated only as a result of external agents such as air resistance. He is apparently the first to conceive such a permanent type of impressed virtue for non-natural motion. [...] Indeed, self-motion of the type conceived by Ibn Sina is almost the opposite of the Aristotelian conception of violent motion of the projectile type, and it is rather reminiscent of the principle of inertia, i.e., Newton's first law of motion.}}</ref>

The eldest [[Banū Mūsā]] brother, Ja'far Muhammad ibn Mūsā ibn Shākir (800-873), wrote the ''Astral Motion'' and ''The Force of Attraction''. The Persian physicist, [[Ibn al-Haytham]] (965-1039), discussed the theory of attraction between bodies. It seems that he was aware of the [[Magnitude (mathematics)|magnitude]] of [[acceleration]] due to [[gravity]] and he discovered that the heavenly bodies "were accountable to the [[Physical law|laws of physics]]".<ref>Duhem, Pierre (1908, 1969). ''To Save the Phenomena: An Essay on the Idea of Physical theory from Plato to Galileo'', p. 28. University of Chicago Press, Chicago.</ref> The Persian polymath [[Abū Rayhān al-Bīrūnī]] (973-1048) was the first to realize that [[acceleration]] is connected with non-uniform motion, part of [[Newton's second law of motion]].<ref name=Biruni>{{MacTutor|id=Al-Biruni|title=Al-Biruni}}</ref> During his debate with [[Avicenna]], al-Biruni also criticized the Aristotelian theory of gravity for denying the existence of [[wiktionary:levity|levity]] or gravity in the [[celestial sphere]]s and for its notion of [[circular motion]] being an [[Intrinsic and extrinsic properties|innate property]] of the [[Astronomical object|heavenly bodies]].<ref name=Berjak>Rafik Berjak and Muzaffar Iqbal, "Ibn Sina--Al-Biruni correspondence", ''Islam & Science'', June 2003.</ref>

In 1121, [[al-Khazini]], in ''The Book of the Balance of Wisdom'', proposed that the gravity and [[gravitational potential energy]] of a body varies depending on its distance from the centre of the Earth.<ref>Mariam Rozhanskaya and I. S. Levinova (1996), "Statics", in Roshdi Rashed, ed., ''[[Encyclopedia of the History of Arabic Science]]'', Vol. 2, p. 614-642 [621-622]. [[Routledge]], London and New York.</ref>{{Failed verification|date=May 2010}} [[Hibat Allah Abu'l-Barakat al-Baghdaadi]] (1080–1165) wrote a critique of Aristotelian physics entitled ''al-Mu'tabar'', where he negated Aristotle's idea that a constant [[force]] produces uniform motion, as he realized that a force applied continuously produces [[acceleration]], a fundamental law of [[classical mechanics]] and an early foreshadowing of [[Newton's second law of motion]].<ref>{{cite encyclopedia
| last = [[Shlomo Pines]]
| title = Abu'l-Barakāt al-Baghdādī , Hibat Allah
| encyclopedia = [[Dictionary of Scientific Biography]]
| volume = 1
| pages = 26–28
| publisher = Charles Scribner's Sons
| location = New York
| year = 1970
| isbn = 0-684-10114-9
}}
 ([[cf.]] Abel B. Franco (October 2003). "Avempace, Projectile Motion, and Impetus Theory", ''Journal of the History of Ideas'' '''64''' (4), p. 521-546 [528].)</ref> Like Newton, he described acceleration as the rate of change of [[speed]].<ref>A. C. Crombie, ''Augustine to Galileo 2'', p. 67.</ref>

In the 14th century, [[Jean Buridan]] developed the [[theory of impetus]] as an alternative to the Aristotelian theory of motion. The theory of impetus was a precursor to the concepts of [[inertia]] and [[momentum]] in classical mechanics.<ref>[[Aydin Sayili]] (1987), "Ibn Sīnā and Buridan on the Motion of the Projectile", ''Annals of the New York Academy of Sciences'' '''500''' (1): 477–482</ref> Buridan and [[Albert of Saxony (philosopher)|Albert of Saxony]] also refer to Abu'l-Barakat in explaining that the acceleration of a falling body is a result of its increasing impetus.<ref name=Gutman>{{Cite book|title=Pseudo-Avicenna, Liber Celi Et Mundi: A Critical Edition|first=Oliver|last=Gutman|publisher=[[Brill Publishers]]|year=2003|isbn=90-04-13228-7|page=193|ref=harv|postscript=}}</ref> In the 16th century, [[Al-Birjandi]] discussed the possibility of the [[Earth's rotation]]. In his analysis of what might occur if the Earth were rotating, he developed a hypothesis similar to [[Galileo Galilei]]'s notion of "circular inertia",<ref>{{Harv|Ragep|2001b|pp=63–4}}</ref> which he described in the following [[Experiment|observational test]]:

{{quote|"The small or large rock will fall to the Earth along the path of a line that is perpendicular to the plane (''sath'') of the horizon; this is witnessed by experience (''tajriba''). And this perpendicular is away from the tangent point of the Earth’s sphere and the plane of the perceived (''hissi'') horizon. This point moves with the motion of the Earth and thus there will be no difference in place of fall of the two rocks."<ref>{{Harv|Ragep|2001a|pp=152–3}}</ref>}}

== Life and death of Aristotelian physics ==
[[File:Rembrandt Harmensz. van Rijn 013.jpg|thumb|right|200px| The famous philosopher Aristotle, depicted in a painting by [[Rembrandt Harmensz. van Rijn]]]]
The reign of Aristotelian physics lasted for almost two millennia, and provides the earliest known speculative theories of physics. After the work of [[Galileo]], [[Descartes]], and many others, it became generally accepted that Aristotelian physics was not correct or viable.<ref name="ari"/>
Despite this, the scholastic science survived well into the seventeenth century, and perhaps even later, until universities amended their curricula.

In [[Europe]], [[Aristotle]]'s theory was first convincingly discredited by the work of [[Galileo Galilei]]. Using a [[telescope]], Galileo observed that the moon was not entirely smooth, but had craters and mountains, contradicting the Aristotelian idea of an incorruptible perfectly smooth moon. Galileo also criticized this notion theoretically – a perfectly smooth moon would reflect light unevenly like a shiny [[billiard ball]], so that the edges of the moon's disk would have a different brightness than the point where a tangent plane reflects sunlight directly to the eye. A rough moon reflects in all directions equally, leading to a disk of approximately equal brightness which is what is observed.<ref name=GalileoTwoSystems>Galileo Galilei, ''[[Dialogue Concerning the Two Chief World Systems]]''.</ref> Galileo also observed that [[Jupiter]] has [[Galilean moons|moons]], objects which revolve around a body other than the Earth. He noted the [[planetary phase|phases]] of Venus, convincingly demonstrating that Venus, and by implication Mercury, travels around the sun, not the Earth.

According to legend, Galileo dropped balls of various [[Density|densities]] from the [[Leaning Tower of Pisa|Tower of Pisa]] and found that lighter and heavier ones fell at almost the same speed. In fact, he did quantitative experiments with balls rolling down an inclined plane, a form of falling that is slow enough to be measured without advanced instruments.

A heavier body falls faster than a lighter one of the same shape in a dense medium like water, and this led Aristotle to speculate that the rate of falling is proportional to the weight and inversely proportional to the density of the medium. From his experience with objects falling in water, he concluded that water is approximately ten times denser than air. By weighing a volume of compressed air, Galileo showed that this overestimates the density of air by a factor of forty.<ref name=GalileoNewSciences>Galileo Galilei, ''[[Two New Sciences]]''.</ref> From his experiments with inclined planes, he concluded that all bodies fall at the same rate neglecting friction.

Galileo also advanced a theoretical argument to support his conclusion. He asked if two bodies of different weights and different rates of fall are tied by a string, does the combined system fall faster because it is now more massive, or does the lighter body in its slower fall hold back the heavier body? The only convincing answer is neither: all the systems fall at the same rate.<ref name=GalileoTwoSystems/>

Followers of Aristotle were aware that the motion of falling bodies was not uniform, but picked up speed with time. Since time is an abstract quantity, the [[Peripatetic school|peripatetic]]s postulated that the speed was proportional to the distance. Galileo established experimentally that the speed is proportional to the time, but he also gave a theoretical argument that the speed could not possibly be proportional to the distance. In modern terms, if the rate of fall is proportional to the distance, the differential equation for the distance y travelled after time t is
:<math>
{dy\over dt} = y
</math>
with the condition that <math>y(0)=0</math>. Galileo demonstrated that this system would stay at <math>y=0</math> for all time. If a perturbation set the system into motion somehow, the object would pick up speed exponentially in time, not quadratically.<ref name=GalileoNewSciences/>

Standing on the surface of the [[moon]] in 1971, [[David Scott]] famously repeated Galileo's experiment by dropping a feather and a hammer from each hand at the same time. In the absence of a substantial [[atmosphere]], the two objects fell and hit the moon's surface at the same time.

With his [[Newton's law of universal gravitation|law of universal gravitation]] [[Isaac Newton]] was the first to mathematically codify a correct theory of gravity. In this theory, any mass is attracted to any other mass by a force which decreases as the inverse square of their distance. In 1915, Newton's theory was replaced by [[Albert Einstein]]'s [[General relativity|general theory of relativity]]. See ''[[gravity]]'' for a much more detailed complete discussion.

== See also ==
Disputed works are marked by *, and ** marks a work generally agreed to be spurious.
* (184a) [[Physics (Aristotle)|Physics]] [http://ebooks.adelaide.edu.au/a/aristotle/physics/ (or ''Physica'')]
* (268a) [[On the Heavens]] [http://ebooks.adelaide.edu.au/a/aristotle/heavens/ (or ''De Caelo'')]
* (314a) [[On Generation and Corruption]] [http://ebooks.adelaide.edu.au/a/aristotle/corruption/ (or ''De Generatione et Corruptione'')]
* (338a) [[Meteorology (Aristotle)|Meteorology]] [http://ebooks.adelaide.edu.au/a/aristotle/meteorology/ (or ''Meteorologica'')]
* (391a) [[On the Universe]]** (or ''De Mundo'')
* (402a) [[On the Soul]] [http://ebooks.adelaide.edu.au/a/aristotle/a8so/ (or ''De Anima'')]
* The [[Parva Naturalia]] ("Little Physical Treatises"):
** (436a) [[Sense and Sensibilia (Aristotle)|Sense and Sensibilia]] [http://ebooks.adelaide.edu.au/a/aristotle/sense/ (or ''De Sensu et Sensibilibus'')]
** (449b) [[On Memory]] [http://ebooks.adelaide.edu.au/a/aristotle/memory/ (or ''De Memoria et Reminiscentia'')]
** (453b) [[On Sleep]] [http://ebooks.adelaide.edu.au/a/aristotle/sleep/ (or ''De Somno et Vigilia'')]
** (458a) [[On Dreams]] [http://ebooks.adelaide.edu.au/a/aristotle/dreams/ (or ''De Insomniis'')]
** (462b) [[On Divination in Sleep]] [http://ebooks.adelaide.edu.au/a/aristotle/prophesy/ (or ''De Divinatione per Somnum'')]
** (464b) [[On Length and Shortness of Life]] [http://ebooks.adelaide.edu.au/a/aristotle/life/ (or ''De Longitudine et Brevitate Vitae'')]
** (467b) [[On Youth, Old Age, Life and Death, and Respiration]] [http://ebooks.adelaide.edu.au/a/aristotle/youth/ (or ''De Juventute et Senectute'', ''De Vita et Morte'', ''De Respiratione'')]
* (481a) [[On Breath]]** (or ''De Spiritu'')
* (486a) [[History of Animals]] [http://etext.virginia.edu/etcbin/toccer-new2?id=AriHian.xml&images=images/modeng&data=/texts/english/modeng/parsed&tag=public&part=all (or ''Historia Animalium'')]
* (639a) [[Parts of Animals]] [http://etext.virginia.edu/etcbin/toccer-new2?id=AriPaan.xml&images=images/modeng&data=/texts/english/modeng/parsed&tag=public&part=all (or ''De Partibus Animalium'')]
* (698a) [[Movement of Animals]] [http://ebooks.adelaide.edu.au/a/aristotle/motion/ (or ''De Motu Animalium'')]
* (704a) [[Progression of Animals]] [http://historyofideas.org/etcbin/toccer-new2?id=AriGait.xml&images=images/modeng&data=/texts/english/modeng/parsed&tag=public&part=all (or ''De Incessu Animalium'')]
* (715a) [[Generation of Animals]] [http://etext.virginia.edu/etcbin/toccer-new2?id=AriGene.xml&images=images/modeng&data=/texts/english/modeng/parsed&tag=public&part=all (or ''De Generatione Animalium'')]
* (791a) [[On Colors]]** (or ''De Coloribus'')
* (800a) [[On Things Heard]]** (or ''De audibilibus'')
* (805a) [[Physiognomonics]]** (or ''Physiognomonica'')
* (815a) [[On Plants]]** (or ''De Plantis'')
* (830a) [[On Marvellous Things Heard]]** [http://www.archive.org/stream/demirabilibusaus00arisrich/demirabilibusaus00arisrich_djvu.txt (or ''De mirabilibus auscultationibus'')]
* (847a) [[Mechanics (Aristotle)|Mechanics]]** (or ''Mechanica'')
* (859a) [[Problems (Aristotle)|Problems]]* (or ''Problemata'')
* (968a) [[On Indivisible Lines]]** (or ''De Lineis Insecabilibus'')
* (973a) [[The Situations and Names of Winds]]** (or ''Ventorum Situs'')
* (974a) [[On Melissus, Xenophanes, and Gorgias]]**

== Notes ==
{{Refbegin}}
'''a''' {{Note_label|A|a|none}} The term "Earth" does not refer to planet [[Earth]], which is known by modern science to be composed of a large number of [[chemical element]]s. Modern chemical elements are not conceptually similar to Aristotle's elements. The term "Air" does not refer to the breathable [[air]]. The Earth's atmosphere is also made up of many chemical elements.
{{Refend}}

== References ==
{{Reflist|2}}
*{{Cite journal
|last=Ragep
|first=F. Jamil
|year=2001a
|title=Tusi and Copernicus: The Earth's Motion in Context
|journal=Science in Context
|volume=14
|issue=1–2
|pages=145–163
|publisher=[[Cambridge University Press]]
|ref=harv
|postscript=
}}
*{{Cite journal
|last=Ragep
|first=F. Jamil
|year=2001b
|title=Freeing Astronomy from Philosophy: An Aspect of Islamic Influence on Science
|journal=Osiris, 2nd Series
|volume=16
|issue=Science in Theistic Contexts: Cognitive Dimensions
|pages=49–64 & 66–71
|ref=harv
|postscript=
|bibcode = 2001Osir...16...49R
|doi=10.1086/649338
|last2=Al-Qushji
|first2=Ali}}
* H. Carteron (1965) "Does Aristotle Have a Mechanics?" in ''Articles on Aristotle 1. Science'' eds. Jonathan Barnes, Malcolm Schofield, Richard Sorabji (London: General Duckworth and Company Limited), 161-174.

==Further reading==
* Katalin Martinás, “Aristotelian Thermodynamics,” ''Thermodynamics: history and philosophy: facts, trends, debates'' (Veszprém, Hungary 23–28 July 1990), 285-303.

{{DEFAULTSORT:Aristotelian Physics (History of Science)}}
[[Category:Aristotle|Physics]]
[[Category:History of physics]]
[[Category:Natural philosophy]]

[[pt:Teoria aristotélica da gravitação]]

Tensor product

2013-07-13T15:15:46Z

Magmalex: /* Prerequisite: the free vector space */ changed: x in F to x in K

{{mergefrom|Dyadic product|Outer product|date=August 2012|discuss=Wikipedia talk:WikiProject Mathematics#Suggested merges with dyadic product and outer product, into tensor product...}}

In [[mathematics]], the '''tensor product''', denoted by ⊗, may be applied in different contexts to [[vector space|vectors]], [[matrix (mathematics)|matrices]], [[tensors]], [[vector spaces]], [[algebra over a field|algebras]], [[topological vector spaces]], and [[module (mathematics)|modules]], among many other structures or objects. In each case the significance of the symbol is the same: the most general [[bilinear operator|bilinear operation]]. In some contexts, this product is also referred to as '''[[outer product]]'''. The term "tensor product" is also used in relation to [[monoidal category|monoidal categories]]. The <math>\boxtimes</math> variant of ⊗ is used in control theory that expresses that the elements in the tensor product are vectors, matrices or tensors which define the vertexes of a given polytopic model, as in [[TP model transformation]].

==Tensor product of vector spaces==
The tensor product of two [[vector space]]s ''V'' and ''W'' over a [[field (mathematics)|field]] ''K'' is another vector space over ''K''. It is denoted ''V'' ⊗''K'' ''W'' or ''V'' ⊗ ''W'' when the underlying field ''K'' is understood.

===Prerequisite: the free vector space===
The construction of ''V'' ⊗ ''W'' requires the notion of the [[free vector space]] ''F''(''S'') on some [[set (mathematics)|set]] ''S''. The elements of the vector space ''F''(''S'') are expressions of the form
:<math>a_1 \cdot s_1 + a_2 \cdot s_2 + \dots + a_n \cdot s_n.</math>
Here the coefficients <math>a_1, \dots, a_n</math> are elements of the ground field ''K'' and the <math>s_1, \dots, s_n</math> are arbitrary elements of ''S''. The plus symbol and the dots are purely formal notation. Addition of such formal linear sums does ''not'' mean that elements of ''S'' are added, nor is any element of ''K'' actually multiplied with one of ''S''. Instead, for example
:<math>(a_1 \cdot s_1 + \dots + a_n \cdot s_n) + (a'_1 \cdot s_1 + b' \cdot s') = (a_1 + a'_1) \cdot s_1 + a_2\cdot s_2 + \dots a_n \cdot s_n + b' \cdot s',</math>
if s' is different from all the elements appearing in the first summand. Moreover, the product ([[scalar multiplication]]) of the above formal linear sum with some element ''x'' in ''K'' is defined as
:<math>(x a_1) \cdot s_1 + (x a_2) \cdot s_2 + \dots + (x a_n) \cdot s_n.</math>
This concludes the definition of the vector space ''F''(''S''). For example, if ''S'' has just 3 elements, then ''F''(''S'') is a 3-[[dimension (vector space)|dimensional]] vector space.

===Definition===
Given two vector spaces ''V'' and ''W'', the [[Cartesian product]] ''V'' × ''W'' is the set consisting of pairs (''v'', ''w'') with ''v'' in ''V'' and ''w'' in ''W''. (This Cartesian product is also a vector space in its own right, but is regarded as a set, only, at this point.) The tensor product is defined as a certain [[quotient space (linear algebra)|quotient vector space]] of ''F''(''V'' × ''W''), the free ''K''-vector space on the Cartesian product.

The free vector space is called like this since different elements of the set ''V'' x ''W'' are not at all related in ''F''(''V'' × ''W''). For example, given two different elements <math>v_1, v_2 \in V</math>, in the free vector space ''F''(''V'' x ''W''), the equation
:<math>(v_1, w_1) + (v_2, w_1) = (v_1 + v_2, w_1)</math>
does ''not'' hold. Likewise, for ''x'' in ''K'', the equation
:<math>x (v_1, w_1) = (x v_1, w_1)</math>
does ''not'' hold in the free vector space. In this sense, even though ''V'' × ''W'' is a vector space itself, that structure is not reflected in ''F''(''V'' × ''W''). The idea of the tensor product is to enforce these two relations and similar ones for the second variable. To do so, consider the subspace ''R'' of ''F''(''V'' × ''W'') generated by the following elements. To simplify notation, it is customary to drop the coefficient if it is one, i.e., <math>(v, w)</math> stands for <math>1 \cdot (v, w) \in F(V \times W)</math>.

<math>\begin{align}
(v_1, w) + (v_2, w) - (v_1 + v_2, w)\\
(v, w_1) + (v, w_2) - (v, w_1 + w_2)\\
c \cdot (v, w) - (cv, w) \\
c \cdot (v, w) - (v, cw),
\end{align}</math>

where ''v'', ''v''1 and ''v''2 are arbitrary elements of ''V'', while ''w'', ''w''1, and ''w''2 are vectors from ''W'', and ''c'' is from the underlying field ''K''.

The tensor product is defined as the vector space
:<math>V \otimes W := F(V \times W) / R.</math>
The ''tensor product of two vectors'' ''v'' and ''w'' is the [[equivalence class]] ((''v'',''w'') + ''R'') of (''v'',''w'') in ''V'' ⊗ ''W''. It is denoted ''v'' ⊗ ''w''. The effect of dividing out ''R'' in the free vector space is the following equations hold in ''V'' ⊗ ''W'':
:<math>\begin{align}
(v_1 + v_2) \otimes w &= v_1 \otimes w + v_2 \otimes w;\\
v \otimes (w_1 + w_2) &= v \otimes w_1 + v \otimes w_2;\\
cv \otimes w &= v \otimes cw = c(v \otimes w).
\end{align}</math>

===Notation and examples===
Given bases {''vi''} and {''wi''} for ''V'' and ''W'' respectively, the tensors {''vi'' ⊗ ''wj''} form a basis for ''V'' ⊗ ''W''. The dimension of the tensor product therefore is the product of dimensions of the original spaces; for instance '''R'''''m'' ⊗ '''R'''''n'' will have dimension ''mn''.

Elements of ''V'' ⊗ ''W'' are sometimes referred to as ''tensors'', although this term refers to many other related concepts as well.<ref>See [[Tensor]] or [[Tensor (intrinsic definition)]].</ref> An element of ''V'' ⊗ ''W'' of the form ''v'' ⊗ ''w'' is called a ''pure'' or ''[[simple tensor]]''. In general, an element of the tensor product space is not a pure tensor, but rather a finite linear combination of pure tensors. That is to say, if ''v''1 and ''v''2 are [[linearly independent]], and ''w''1 and ''w''2 are also linearly independent, then ''v''1 ⊗ ''w''1 + ''v''2 ⊗ ''w''2 cannot be written as a pure tensor. The number of simple tensors required to express an element of a tensor product is called the [[tensor rank]] (not to be confused with [[tensor order]], which is the number of spaces one has taken the product of, in this case 2; in notation, the number of indices), and for linear operators or matrices, thought of as (1,1) tensors (elements of the space ''V'' ⊗ ''V''*), it agrees with [[matrix rank]].

===Tensor product of linear maps===
The tensor product also operates on [[linear map]]s between vector spaces. Specifically, given two linear maps ''S'' : ''V'' → ''X'' and ''T'' : ''W'' → ''Y'' between vector spaces, the ''tensor product of the two linear maps'' ''S'' and ''T'' is a linear map
:<math>S\otimes T:V\otimes W\rightarrow X\otimes Y</math>

defined by
:<math>(S\otimes T)(v\otimes w)=S(v)\otimes T(w).</math>

In this way, the tensor product becomes a [[bifunctor]] from the category of vector spaces to itself, [[Functor#Covariance and contravariance|covariant]] in both arguments.<ref>
{{cite book|last1=Hazewinkel|first1=Michiel|last2=Gubareni|first2=Nadezhda Mikhaĭlovna|
last3=Gubareni|first3=Nadiya|last4=Kirichenko|first4=Vladimir V.|
title=Algebras, rings and modules|page=100|
publisher=Springer|year=2004|isbn=978-1-4020-2690-4}}</ref>

By choosing bases of all vector spaces involved, the linear maps ''S'' and ''T'' can be represented by [[matrix (mathematics)|matrices]]. Then, the matrix describing the tensor product <math>S \otimes T</math> is the [[Kronecker product]] of the two matrices. For example, if ''V'', ''X'', ''W'', and ''Y'' are all two-dimensional above and bases have been fixed for all of them, and ''S'' and ''T'' are given by the matrices <math>\begin{bmatrix}
a_{1,1} & a_{1,2} \\
a_{2,1} & a_{2,2} \\
\end{bmatrix}</math> and <math>\begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
\end{bmatrix}</math>, respectively, then the tensor product of these two matrices is
:<math>
\begin{bmatrix}
a_{1,1} & a_{1,2} \\
a_{2,1} & a_{2,2} \\
\end{bmatrix}
\otimes
\begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
\end{bmatrix}
=
\begin{bmatrix}
a_{1,1} \begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
\end{bmatrix} & a_{1,2} \begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
\end{bmatrix} \\
& \\
a_{2,1} \begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
\end{bmatrix} & a_{2,2} \begin{bmatrix}
b_{1,1} & b_{1,2} \\
b_{2,1} & b_{2,2} \\
\end{bmatrix} \\
\end{bmatrix}
=
\begin{bmatrix}
a_{1,1} b_{1,1} & a_{1,1} b_{1,2} & a_{1,2} b_{1,1} & a_{1,2} b_{1,2} \\
a_{1,1} b_{2,1} & a_{1,1} b_{2,2} & a_{1,2} b_{2,1} & a_{1,2} b_{2,2} \\
a_{2,1} b_{1,1} & a_{2,1} b_{1,2} & a_{2,2} b_{1,1} & a_{2,2} b_{1,2} \\
a_{2,1} b_{2,1} & a_{2,1} b_{2,2} & a_{2,2} b_{2,1} & a_{2,2} b_{2,2} \\
\end{bmatrix}.
</math>

The resultant rank is at most 4, and the resultant dimension 16. Here rank denotes the [[tensor rank]] (number of requisite indices), while the [[matrix rank]] counts the number of degrees of freedom in the resulting array.

A [[dyadic product]] is the special case of the tensor product between two vectors of the same dimension.

===Universal property===
The tensor product as defined above has a [[universal property]]. In general, a universal property means that some mathematical object is characterized by the maps with target (or, alternatively, domain) this object. In the context of [[linear algebra]] and vector spaces, the maps in question are required to be [[linear map]]s. The tensor product of vector spaces, as defined above, satisfies the following universal property: there is a [[bilinear]] (i.e., linear in each variable ''v'' and ''w'') map
<math>\varphi : V\times W\to V \otimes W</math>
such that given ''any'' other vector space ''Z'' together with a bilinear map <math>h:V\times W\to Z</math>, there is a unique linear map <math>\tilde{h}:V\otimes W\to Z</math> verifying <math> h=\tilde{h}\circ \varphi</math>.
In this sense, <math>\varphi</math> is the most general bilinear map that can be built from <math>V \times W </math>.

This characterization can simplify proving statements about the tensor product. For example, the tensor product is symmetric: that is, there is a [[canonical]] isomorphism:
:<math>V \otimes W \cong W \otimes V.</math>
To construct, say, a map from left to right, it suffices, by the universal property, to give a bilinear map
<math>V \times W \to W \otimes V.</math>
This is done by mapping (''v'', ''w'') to <math>w \otimes v</math>. Constructing a map in the opposite direction is done similarly, as is checking that the two linear maps <math>V \otimes W \to W \otimes V</math> and <math>W \otimes V \to V \otimes W</math> are inverse to one another.

A similar reasoning can be used to show that the tensor product is associative, that is, there are natural isomorphisms
:<math>V_1\otimes(V_2\otimes V_3)\cong (V_1\otimes V_2)\otimes V_3.</math>
Therefore, it is customary to omit the parentheses and write <math>V_1\otimes V_2\otimes V_3</math>.

===Tensor powers and braiding===
Let ''n'' be a non-negative integer. The ''n''th '''tensor power''' of the vector space ''V'' is the ''n''-fold tensor product of ''V'' with itself. That is
:<math>V^{\otimes n} \;\overset{\mathrm{def}}{=}\; \underbrace{V\otimes\cdots\otimes V}_{n}.</math>

A [[permutation]] σ of the set {1, 2, ..., ''n''} determines a mapping of the ''n''th Cartesian power of ''V''
:<math>\sigma : V^n\to V^n</math>

defined by
:<math>\sigma(v_1,v_2,\dots,v_n) = (v_{\sigma 1}, v_{\sigma 2},\dots,v_{\sigma n}).</math>

Let
:<math>\varphi:V^n \to V^{\otimes n}</math>

be the natural multilinear embedding of the Cartesian power of ''V'' into the tensor power of ''V''. Then, by the universal property, there is a unique isomorphism
:<math>\tau_\sigma : V^{\otimes n} \to V^{\otimes n}</math>

such that
:<math>\varphi\circ\sigma = \tau_\sigma\circ\varphi.</math>

The isomorphism τσ is called the '''braiding map''' associated to the permutation σ.

==Product of tensors==
{{See also|Classical treatment of tensors}}
For non-negative integers ''r'' and ''s'' a ''r'',''s''-[[tensor]] on a vector space ''V'' is an element of
:<math> T^r_s(V) = \underbrace{ V\otimes \dots \otimes V}_{r} \otimes \underbrace{ V^*\otimes \dots \otimes V^*}_{s} = V^{\otimes r}\otimes V^{*\otimes s}.</math>
Here <math>V^*</math> is the [[dual vector space]] (which consists of all [[linear map]]s ''f'' from ''V'' to the ground field ''K'').

There is a product map, called the ''(tensor) product of tensors''
:<math>T^r_s (V) \otimes_K T^{r'}_{s'} (V) \to T^{r+r'}_{s+s'}(V).</math>
It is defined by grouping all occuring "factors" ''V'' together: writing <math>v_i</math> for an element of ''V'' and <math>f_i</math> for elements of the dual space,
:<math>(v_1 \otimes f_1) \otimes (v'_1) = v_1 \otimes v'_1 \otimes f_1.</math>

Picking a basis of ''V'' and the corresponding [[dual basis]] of <math>V^*</math>, <math>T^r_s(V)</math> is endowed with a natural basis (this basis is described in the [[Kronecker_product#Relation_to_the_abstract_tensor_product|article on Kronecker products]]). In terms of these bases, the [[Coordinate vector|components]] of a (tensor) product of two (or more) [[tensor]]s can be computed. For example, if ''F'' and ''G'' are two [[covariance and contravariance of vectors|covariant]] tensors of rank ''m'' and ''n'' (respectively) (i.e. ''F'' ∈ ''T''m0, and ''G'' ∈ ''T''n0), then the components of their tensor product are given by

:<math>(F\otimes G)_{i_1i_2...i_{m+n}} = F_{i_{1}i_{2}...i_{m}}G_{i_{m+1}i_{m+2}i_{m+3}...i_{m+n}}.</math>
<ref>Analogous formulas also hold for [[covariance and contravariance of vectors|contravariant]] tensors, as well as tensors of mixed variance. Although in many cases such as when there is an [[inner product]] defined, the distinction is irrelevant.</ref>
Thus, the components of the tensor product of two tensors are the ordinary product of the components of each tensor. Another example: let '''U''' be a tensor of type (1,1) with components ''Uαβ'', and let '''V''' be a tensor of type (1,0) with components ''Vγ''. Then
:<math> U^\alpha {}_\beta V^\gamma = (U \otimes V)^\alpha {}_\beta {}^\gamma </math>

and
:<math> V^\mu U^\nu {}_\sigma = (V \otimes U)^{\mu \nu} {}_\sigma. </math>

==Relation to dual space==
A particular example is the tensor product of some vector space ''V'' with its [[dual vector space]] <math>V^*</math> (which consists of all [[linear map]]s ''f'' from ''V'' to the ground field ''K''). In this case, there is a natural "evaluation" map
:<math>V \otimes V^* \to K</math>
which on elementary tensors is defined by
:<math>v \otimes f \mapsto f(v).</math>
The resulting map
:<math>T^r_s (V) \to T^{r-1}_{s-1}(V)</math>
is called [[tensor contraction]] (for ''r'', ''s'' > 0).

On the other hand, if ''V'' is ''finite-dimensional'', there is a map in the other direction (called [[coevaluation]])
:<math>K \to V \otimes V^*, \lambda \mapsto \sum_i \lambda v_i \otimes v^*_i.</math>
where <math>v_1, \dots, v_n</math> is a basis of ''V'', and <math>v^*_i</math> is its dual basis. The interplay of evaluation and coevaluation map can be used to characterize finite-dimensional vector spaces without referring to bases.<ref>See [[Compact closed category]].</ref>

===Tensor product vs. Hom===
Given three vector spaces ''U'', ''V'', ''W'' the tensor product is linked to the vector space of ''all'' linear maps, as follows:
:<math>Hom (U \otimes V, W) \cong Hom (U, Hom(V, W)).</math>
Here <math>Hom (-,-)</math> denotes the ''K''-vector space of all linear maps. This is an example of [[adjoint functor]]s: the tensor product is "left adjoint" to Hom.

===Adjoint representation===
The tensor <math> \scriptstyle T^r_s(V) </math> may be naturally viewed as a module for the [[Lie algebra]] End(''V'') by means of the diagonal action: for simplicity let us assume ''r'' = ''s'' = 1, then, for each <math>\scriptstyle u \in\mathrm{End}(V) </math>,
:<math> u(a \otimes b) = u(a) \otimes b - a \otimes u^*(b),</math>

where ''u''* in End(''V''*) is the [[transpose]] of ''u'', that is, in terms of the obvious pairing on ''V'' ⊗ ''V''*,
:<math>\langle u(a), b \rangle = \langle a, u^*(b) \rangle</math>.

There is a canonical isomorphism <math>\scriptstyle T^1_1(V) \rightarrow \mathrm{End}(V) </math> given by
:<math>(a \otimes b)(x) = \langle x, b \rangle a. </math>

Under this isomorphism, every ''u'' in End(''V'') may be first viewed as an endomorphism of <math>\scriptstyle T^1_1(V)</math> and then viewed as an endomorphism of End(''V''). In fact it is the [[Adjoint representation of a Lie algebra|adjoint representation]] ad(''u'') of End(''V'') .

==Tensor products of modules over a ring ==
{{main|Tensor product of modules}}
The tensor product of two [[module (mathematics)|modules]] ''A'' and ''B'' over a ''[[commutative ring|commutative]]'' [[ring (mathematics)|ring]] <math> R </math> is defined in the exact same way as the tensor product of vector spaces over a field:
:<math>M \otimes_R N := F (A \times B) / G</math>
where now <math>F (A \times B)</math> is the [[free module|free ''R''-module]] generated by the cartesian product and ''G'' is the ''R''-module generated by the same relations as above.

More generally, the tensor product can be defined even if the ring is non-commutative (''ab ≠ ba''). In this case ''M'' has to be a ''right''-''R''-module and ''N'' is a left-''R''-module, and instead of the last two relations above, the relation
:<math>(ar,b)-(a,rb)</math>
is imposed. If ''R'' is non-commutative, this is no longer an ''R''-module, but just an [[abelian group]].

The universal property also carries over, slightly modified: the map <math> \phi: A \times B \to A \otimes_R B </math> defined by <math> (a,b) \mapsto a \otimes b </math> is a [[middle linear map]] (referred to as "canonical Middle Linear Map".<ref>
{{cite book|last=Hungerford|first=Thomas W.|title=Algebra|
publisher=Springer|year=1974|isbn=0-387-90518-9}}</ref>); that is,<ref name=chen>
{{citation|last=Chen|first=Jungkai Alfred|title=Advanced Algebra II|chapter=Tensor product|
chapter-url=http://www.math.ntu.edu.tw/~jkchen/S04AA/S04AAL10.pdf|
type=lecture notes|year=2004|month=spring|place=National Taiwan University}}</ref> it satisfies:

:<math> \begin{align}
\phi(a+a',b)=\phi(a,b)+\phi(a',b) \\
\phi(a,b+b')=\phi(a,b)+\phi(a,b') \\
\phi(ar,b)=\phi(a,rb)
\end{align} </math>

The first two properties make <math> \phi </math> a homomorphism of the [[abelian group]] <math> A \times B </math>. For any [[middle linear map]] <math> \psi </math> of <math> A \times B </math>, a unique group homomorphism <math> f </math> of <math> A \otimes_R B </math> satisfies <math> \psi = f \circ \phi </math>, and this property determines <math> \phi </math> within group isomorphism. See the [[tensor product of modules|main article]] for details.

===Computing the tensor product===
For vector spaces, the tensor product <math>V \otimes W</math> is quickly computed since bases of ''V'' of ''W'' immediately determine a basis of <math>V \otimes W</math>, as was mentioned above. For modules over a general (commutative) ring, not every module is free. For example, '''Z'''/''n'' is not a free abelian group (='''Z'''-module). The tensor product with '''Z'''/''n'' is given by
:<math>M \otimes_\mathbf Z \mathbf Z/n = M/n.</math>
More generally, given a [[presentation]] of some ''R''-module ''M'', that is, a number of generators <math>m_i \in M, i \in I</math> together with relations <math>\sum_{j \in J} a_{ji} m_i = 0</math>, with <math>a_{ji} \in R</math>, the tensor product can be computed as the following [[cokernel]]:
:<math>M \otimes_R N = \operatorname{coker} (N^J \rightarrow N^I)</math>
Here <math>N^J := \oplus_{j \in J} N</math> and the map is determined by sending some <math>n \in N</math> in the ''j''-th copy of <math>N^J</math> to <math>a_{ji} n</math> (in <math>N^I</math>). Colloquially, this may be rephrased by saying that a presentation of ''M'' gives rise to a presentation of <math>M \otimes_R N</math>. This is referred to by saying that the tensor product is a [[right exact functor]]. It is not in general left exact, that is, given an injective map of ''R''-modules <math>M_1 \to M_2</math>, the tensor product
:<math>M_1 \otimes_R N \to M_2 \otimes_R N</math>
is not usually injective. For example, tensoring the (injective) map given by multiplication with ''n'', <math>n: \mathbf Z \to \mathbf Z</math> with <math>\mathbf Z/n</math> yields the 0 map <math>0 : \mathbf Z/n \to \mathbf Z/n</math> which is not injective. Higher [[Tor functor]]s measure the defect of the tensor product being not left exact.

==Tensor product of algebras==
{{main|Tensor product of algebras}}
Let ''R'' be a commutative ring. The tensor product of ''R''-modules applies, in particular, if ''A'' and ''B'' are [[algebra|''R''-algebras]]. In this case, the tensor product <math>A \otimes_R B</math> is an ''R''-algebra itself by putting
:<math>(a_1 \otimes b_1) \cdot (a_2 \otimes b_2) = (a_1 \cdot a_2) \otimes (b_1 \cdot b_2).</math>
For example,
:<math>R[x] \otimes_R R[y] = R[x, y].</math>

A particular example is when ''A'' and ''B'' are fields containing a common subfield ''R''. The [[tensor product of fields]] is closely related to [[Galois theory]]: if, say, <math>A = R[x] / f(x)</math>, where ''f'' is some [[irreducible polynomial]] with coefficients in ''R'', the tensor product can be calculated as
:<math>A \otimes_R B = B[x] / f(x)</math>
where now ''f'' is interpreted as the same polynomial, but with its coefficients regarded as elements of ''B''. In the larger field ''B'', the polynomial may become reducible, which brings in Galois theory. For example, if ''A'' = ''B'' is a [[Galois extension]] of ''R'', then
:<math>A \otimes_R A = A[x] / f(x)</math>
is isomorphic (as an ''A''-algebra) to the <math>A^{deg(f)}</math>.

== Other examples of tensor products ==
===Tensor product of Hilbert spaces===
{{main|Tensor product of Hilbert spaces}}

===Topological tensor product===
{{main|Topological tensor product}}

===Tensor product of graded vector spaces===
{{main|Graded vector space#Operations on graded vector spaces}}

===Tensor product of quadratic forms===
{{main|Tensor product of quadratic forms}}

===Tensor product of multilinear maps===
Given [[multilinear]] maps <math>\scriptstyle f (x_1,\dots,x_k)</math> and <math>\scriptstyle g (x_1,\dots, x_m)</math> their tensor product is the multilinear function
:<math> (f \otimes g) (x_1,\dots,x_{k+m}) = f(x_1,\dots,x_k) g(x_{k+1},\dots,x_{k+m}). </math>

===Tensor product of graphs===
{{main|Tensor product of graphs}}

==Applications==
===Exterior and symmetric algebra===
Two notable constructions in linear algebra can be constructed as quotients of the tensor product: the [[exterior algebra]] and the [[symmetric algebra]]. For example, given a vector space ''V'', the exterior product
:<math>V \wedge V</math>
is defined as
:<math>V \otimes V / (v_1 \otimes v_2 + v_2 \otimes v_1 \text{ for all } v_1, v_2 \in V).</math>
The image of <math>v_1 \otimes v_2</math> in the exterior product is usually denoted <math>v_1 \wedge v_2</math> and satisfies, by construction, <math>v_1 \wedge v_2 = - v_2 \wedge v_1</math>. Similar constructions are possible for <math>V \otimes \dots \otimes V</math> (''n'' factors), giving rise to <math>\Lambda^n V</math>, the ''n''-th [[exterior power]] of ''V''. The latter notion is the basis of [[differential form|differential ''n''-forms]].

The symmetric algebra is constructed in a similar manner:
:<math>Sym^n V := \underbrace{V \otimes \dots \otimes V}_n / (\dots \otimes v_i \otimes v_{i+1} \otimes \dots - \dots \otimes v_{i+1} \otimes v_{i} \otimes \dots)</math>
That is, in the symmetric algebra two adjacent vectors (and therefore all of them) can be interchanged. The resulting objects are called symmetric tensors.

===Tensor product of line bundles===
{{main|Vector bundle#Operations on vector bundles}}

==Tensor product for computer programmers==

===Array programming languages===
[[Array programming languages]] may have this pattern built in. For example, in [[APL programming language|APL]] the tensor product is expressed as <math>\scriptstyle\circ . \times</math> (for example <math>\scriptstyle A \circ . \times B</math> or <math>\scriptstyle A \circ . \times B \circ . \times C</math>). In [[J programming language|J]] the tensor product is the dyadic form of '''*/''' (for example '''a */ b''' or ''' a */ b */ c''').

Note that J's treatment also allows the representation of some tensor fields, as '''a''' and '''b''' may be functions instead of constants. This product of two functions is a derived function, and if '''a''' and '''b''' are [[differentiable]], then '''a*/b''' is differentiable.

However, these kinds of notation are not universally present in array languages. Other array languages may require explicit treatment of indices (for example, [[MATLAB]]), and/or may not support [[higher-order functions]] such as the [[Jacobian matrix and determinant|Jacobian derivative]] (for example, [[Fortran]]/APL).

==See also==
*[[Dyadic product]]
*[[Extension of scalars]]
*[[Multilinear subspace learning]]
*[[Tensor algebra]]
*[[Tensor contraction]]
*[[Topological tensor product]]

==Notes==
<references/>

==References==
* {{citation|first = Nicolas|last=Bourbaki|authorlink=Nicolas Bourbaki | title = Elements of mathematics, Algebra I| publisher = Springer-Verlag | year = 1989|isbn=3-540-64243-9}}.
* {{citation|authorlink=Paul Halmos|first=Paul|last=Halmos|title=Finite dimensional vector spaces|year=1974|publisher=Springer|isbn=0-387-90093-4}}.
* {{Lang Algebra|edition=3r}}
* {{citation|first1=S.|last1=Mac Lane|authorlink1=Saunders Mac Lane|authorlink2=Garrett Birkhoff|last2=Birkhoff|first2=G.|title=Algebra|publisher=AMS Chelsea|year=1999|isbn=0-8218-1646-2}}.

{{tensors}}

{{DEFAULTSORT:Tensor Product}}
[[Category:Binary operations]]
[[Category:Bilinear operators]]

Global element

2013-03-31T10:17:32Z

Magmalex: edited clique link

In [[category theory]], a '''global element''' of an object ''A'' from a [[Category (mathematics)|category]] is a morphism
: ''h'' : 1 → ''A'',
where 1 is a [[terminal object]] of the category. Roughly speaking, global elements are a generalization of the notion of “elements” from the [[category of sets]], and they can be used to import set-theoretic concepts into category theory. However, unlike a set, an object of a general category need not be determined by its global elements (not even [[up to]] [[isomorphism]]). For example the terminal object of the category '''Grph''' of graphs has one vertex and one edge, a self-loop, whence the global elements of a graph are its self-loops, conveying no information either about other kinds of edges, or about vertices having no self-loop, or about whether two self-loops share a vertex.

In an [[elementary topos]] the global elements of the [[subobject classifier]] Ω form a Heyting algebra when ordered by inclusion of the corresponding subobjects of the terminal object. For example '''Grph''' happens to be a topos, whose subobject classifier Ω is a two-vertex directed [[Clique_(graph_theory)|clique]] with an additional self-loop (so five edges, three of which are self-loops and hence the global elements of Ω). The internal logic of '''Grph''' is therefore based on the three-element Heyting algebra as its [[truth value]]s.

{{cattheory-stub}}

[[Category:Objects (category theory)]]

Global element

2013-03-31T10:15:00Z

Magmalex: edited clique link

In [[category theory]], a '''global element''' of an object ''A'' from a [[Category (mathematics)|category]] is a morphism
: ''h'' : 1 → ''A'',
where 1 is a [[terminal object]] of the category. Roughly speaking, global elements are a generalization of the notion of “elements” from the [[category of sets]], and they can be used to import set-theoretic concepts into category theory. However, unlike a set, an object of a general category need not be determined by its global elements (not even [[up to]] [[isomorphism]]). For example the terminal object of the category '''Grph''' of graphs has one vertex and one edge, a self-loop, whence the global elements of a graph are its self-loops, conveying no information either about other kinds of edges, or about vertices having no self-loop, or about whether two self-loops share a vertex.

In an [[elementary topos]] the global elements of the [[subobject classifier]] Ω form a Heyting algebra when ordered by inclusion of the corresponding subobjects of the terminal object. For example '''Grph''' happens to be a topos, whose subobject classifier Ω is a two-vertex directed [[Clique_(graph_theory)]] with an additional self-loop (so five edges, three of which are self-loops and hence the global elements of Ω). The internal logic of '''Grph''' is therefore based on the three-element Heyting algebra as its [[truth value]]s.

{{cattheory-stub}}

[[Category:Objects (category theory)]]

Identity function

2013-03-26T22:20:56Z

Magmalex: added '''identity relation''' as a synonim

{{distinguish|Null function|Empty function}}
{{Unreferenced|date=December 2009}}
In [[mathematics]], an '''identity function''', also called '''identity relation''' or '''identity map''' or '''identity transformation''', is a [[function (mathematics)|function]] that always returns the same value that was used as its argument. In terms of [[equation]]s, the function is given by ''f''(''x'') = ''x''.

==Definition==
Formally, if ''M'' is a [[Set (mathematics)|set]], the identity function ''f'' on ''M'' is defined to be that function with [[domain (mathematics)|domain]] and [[codomain]] ''M'' which satisfies
:''f''(''x'') = ''x''    for all elements ''x'' in ''M''.

In other words, the function assigns to each element ''x'' of ''M'' the element ''x'' of ''M''.

The identity function ''f'' on ''M'' is often denoted by id''M''.

In terms of [[set theory]], where a function is defined as a particular kind of [[binary relation]], the identity function is given by the [[identity relation]], or ''diagonal'' of ''M''.

==Algebraic property==
If ''f'' : ''M'' → ''N'' is any function, then we have ''f'' o id''M'' = ''f'' = id''N'' o ''f'' (where "o" denotes [[function composition]]). In particular, id''M'' is the [[identity element]] of the [[monoid]] of all functions from ''M'' to ''M''.

Since the identity element of a monoid is [[unique]], one can alternately define the identity function on ''M'' to be this identity element. Such a definition generalizes to the concept of an [[identity morphism]] in [[category theory]], where the [[endomorphism]]s of ''M'' need not be functions.

==Properties==
*The identity function is a [[linear map|linear operator]], when applied to [[vector space]]s.
*The identity function on the positive [[integer]]s is a [[completely multiplicative function]] (essentially multiplication by 1), considered in [[number theory]].
*In an ''n''-dimensional [[vector space]] the identity function is represented by the [[identity matrix]] ''I''''n'', regardless of the [[Basis (linear algebra)|basis]].
*In a [[metric space]] the identity is trivially an [[isometry]]. An object without any [[symmetry]] has as [[symmetry group]] the trivial group only containing this isometry (symmetry type ''C1).

==See also==
*[[Inclusion map]]

{{DEFAULTSORT:Identity Function}}
[[Category:Functions and mappings]]
[[Category:Elementary mathematics]]
[[Category:Basic concepts in set theory]]
[[Category:Types of functions]]
[[Category:One]]

Diagram (category theory)

2013-01-09T02:44:36Z

Magmalex: /* External links */

In [[category theory]], a branch of mathematics, a '''diagram''' is the categorical analogue of an [[indexed family]] in [[set theory]]. The primary difference is that in the categorical setting one has [[morphism]]s that also need indexing. An indexed family of sets is a collection of sets, indexed by a fixed set; equivalently, a ''function'' from a fixed index ''set'' to the class of ''sets''. A diagram is a collection of objects and morphisms, indexed by a fixed category; equivalently, a ''functor'' from a fixed index ''category'' to some ''category''.

Diagrams are central to the definition of [[limit (category theory)|limits and colimits]], and to the related notion of [[cone (category theory)|cone]]s.

==Definition==

Formally, a '''diagram''' of type ''J'' in a [[category (mathematics)|category]] ''C'' is a ([[Covariance and contravariance of functors|covariant]]) [[functor]]
:''D'' : ''J'' → ''C''
The category ''J'' is called the '''index category''' or the '''scheme''' of the diagram ''D''; the functor is sometimes called a '''''J''-shaped diagram'''.<ref>J.P. May, ''A Concise Course in Algebraic Topology'', (1999) The University of Chicago Press, ISBN 0-226-51183-9</ref> The actual objects and morphisms in ''J'' are largely irrelevant, only the way in which they are interrelated matters. The diagram ''D'' is thought of as indexing a collection of objects and morphisms in ''C'' patterned on ''J''.

Although, technically, there is no difference between an individual ''diagram'' and a ''functor'' or between a ''scheme'' and a ''category'', the change in terminology reflects a change in perspective, just as in the set theoretic case: one fixes the index category, and allows the functor (and, secondarily, the target category) to vary.

One is most often interested in the case where the scheme ''J'' is a [[small category|small]] or even [[Finite set|finite]] category. A diagram is said to be '''small''' or '''finite''' whenever ''J'' is.

A morphism of diagrams of type ''J'' in a category ''C'' is a [[natural transformation]] between functors. One can then interpret the '''category of diagrams''' of type ''J'' in ''C'' as the [[functor category]] ''C''''J'', and a diagram is then an object in this category.

==Examples==
* Given any object ''A'' in ''C'', one has the '''constant diagram''', which is the diagram that maps all objects in ''J'' to ''A'', and all morphisms of ''J'' to the identity morphism on ''A''. Notationally, one often uses an underbar to denote the constant diagram: thus, for any object <math>A</math> in ''C'', one has the constant diagram <math>\underline A</math>.

* If ''J'' is a (small) [[discrete category]], then a diagram of type ''J'' is essentially just an [[indexed family]] of objects in ''C'' (indexed by ''J''). When used in the construction of the [[limit (category theory)|limit]], the result is the [[product (category theory)|product]]; for the colimit, one gets the [[coproduct]]. So, for example, when ''J'' is the discrete category with two objects, the resulting limit is just the binary product.

* If ''J'' = -1 ← 0 → +1, then a diagram of type ''J'' (''A'' ← ''B'' → ''C'') is a [[span (category theory)|span]], and its colimit is a [[Pushout (category theory)|pushout]]. If one were to "forget" that the diagram had object ''B'' and the two arrows ''B'' → ''A'', ''B'' → ''C'', the resulting diagram would simply be the discrete category with the two objects ''A'' and ''C'', and the colimit would simply be the binary coproduct. Thus, this example shows an important way in which the idea of the diagram generalizes that of the [[index set]] in set theory: by including the morphisms ''B'' → ''A'', ''B'' → ''C'', one discovers additional structure in constructions built from the diagram, structure that would not be evident if one only had an index set with no relations between the objects in the index.

* If ''J'' = -1 → 0 ← +1, then a diagram of type ''J'' (''A'' → ''B'' ← ''C'') is a [[cospan]], and its limit is a [[Pullback (category theory)|pullback]].

* The index <math>J = 0 \overrightarrow{\to} 1</math> is called "two parallel morphisms", or sometimes the [[free quiver]] or the [[walking quiver]]. A diagram of type ''J'' (<math>f,g\colon X \to Y</math>) is then a [[quiver (mathematics)|quiver]]; its limit is an [[Equaliser (mathematics)|equalizer]], and its colimit is a [[coequalizer]].

* If ''J'' is a [[poset category]], then a diagram of type ''J'' is a family of objects ''D''''i'' together with a unique morphism ''f''''ij'' : ''D''''i'' → ''D''''j'' whenever ''i'' ≤ ''j''. If ''J'' is [[directed set|directed]] then a diagram of type ''J'' is called a [[direct system (mathematics)|direct system]] of objects and morphisms. If the diagram is [[contravariant functor|contravariant]] then it is called an [[inverse system]].

==Cones and limits==

A [[cone (category theory)|cone]] with vertex ''N'' of a diagram ''D'' : ''J'' → ''C'' is a morphism from the constant diagram Δ(''N'') to ''D''. The constant diagram is the diagram which sends every object of ''J'' to an object ''N'' of ''C'' and every morphism to the identity morphism on ''N''.

The [[limit (category theory)|limit]] of a diagram ''D'' is a [[universal cone]] to ''D''. That is, a cone through which all other cones uniquely factor. If the limit exists in a category ''C'' for all diagrams of type ''J'' one obtains a functor
:lim : ''C''''J'' → ''C''
which sends each diagram to its limit.

Dually, the [[colimit]] of diagram ''D'' is a universal cone from ''D''. If the colimit exists for all diagrams of type ''J'' one has a functor
:colim : ''C''''J'' → ''C''
which sends each diagram to its colimit.

== Commutative diagrams ==
{{main|Commutative diagram}}

Diagrams and functor categories are often visualized by [[commutative diagrams]], particularly if the index category is a finite [[poset category]] with few elements: one draws a commutative diagram with a node for every object in the index category, and an arrow for a generating set of morphisms, omitting identity maps and morphisms that can be expressed as compositions. The commutativity corresponds to the uniqueness of a map between two objects in a poset category. Conversely, every commutative diagram represents a diagram (a functor from a poset index category) in this way.

Not every diagram commutes, as not every index category is a poset category:
most simply, the diagram of a single object with an endomorphism (<math>f\colon X \to X</math>), or with two parallel arrows (<math>\bullet \overrightarrow{\to} \bullet</math>; <math>f,g\colon X \to Y</math>) need not commute. Further, diagrams may be impossible to draw (because infinite) or simply messy (because too many objects or morphisms); however, schematic commutative diagrams (for subcategories of the index category, or with ellipses, such as for a directed system) are used to clarify such complex diagrams.

== See also ==
* [[Direct system (mathematics)|Direct system]]
* [[Inverse system]]

==References==
{{reflist}}
*{{cite book | last = Adámek | first = Jiří | coauthors = Horst Herrlich, and George E. Strecker | year = 1990 | url = http://katmat.math.uni-bremen.de/acc/acc.pdf | title = Abstract and Concrete Categories | publisher = John Wiley & Sons | isbn = 0-471-60922-6}} Now available as free on-line edition (4.2MB PDF).
* {{Cite book| last1=Barr| first1=Michael|authorlink1=Michael Barr (mathematician) | last2=Wells| first2=Charles| authorlink2=Charles Wells (mathematician) |year=2002| title=Toposes, Triples and Theories|url=http://www.tac.mta.ca/tac/reprints/articles/12/tr12.pdf|isbn=0-387-96115-1}} Revised and corrected free online version of ''Grundlehren der mathematischen Wissenschaften (278)'' Springer-Verlag, 1983).
* {{nlab|id=diagram}}

== External links ==
* [http://mathworld.wolfram.com/DiagramChasing.html Diagram Chasing] at [[MathWorld]]
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, commutative diagrams, categories, [[functor]]s, [[natural transformation]]s.
[[Category:Functors]]

[[ko:그림 (범주론)]]
[[nl:Diagram (categorietheorie)]]
[[pl:Diagram (teoria kategorii)]]

Diagram (category theory)

2013-01-09T02:42:40Z

Magmalex: /* External links */

In [[category theory]], a branch of mathematics, a '''diagram''' is the categorical analogue of an [[indexed family]] in [[set theory]]. The primary difference is that in the categorical setting one has [[morphism]]s that also need indexing. An indexed family of sets is a collection of sets, indexed by a fixed set; equivalently, a ''function'' from a fixed index ''set'' to the class of ''sets''. A diagram is a collection of objects and morphisms, indexed by a fixed category; equivalently, a ''functor'' from a fixed index ''category'' to some ''category''.

Diagrams are central to the definition of [[limit (category theory)|limits and colimits]], and to the related notion of [[cone (category theory)|cone]]s.

==Definition==

Formally, a '''diagram''' of type ''J'' in a [[category (mathematics)|category]] ''C'' is a ([[Covariance and contravariance of functors|covariant]]) [[functor]]
:''D'' : ''J'' → ''C''
The category ''J'' is called the '''index category''' or the '''scheme''' of the diagram ''D''; the functor is sometimes called a '''''J''-shaped diagram'''.<ref>J.P. May, ''A Concise Course in Algebraic Topology'', (1999) The University of Chicago Press, ISBN 0-226-51183-9</ref> The actual objects and morphisms in ''J'' are largely irrelevant, only the way in which they are interrelated matters. The diagram ''D'' is thought of as indexing a collection of objects and morphisms in ''C'' patterned on ''J''.

Although, technically, there is no difference between an individual ''diagram'' and a ''functor'' or between a ''scheme'' and a ''category'', the change in terminology reflects a change in perspective, just as in the set theoretic case: one fixes the index category, and allows the functor (and, secondarily, the target category) to vary.

One is most often interested in the case where the scheme ''J'' is a [[small category|small]] or even [[Finite set|finite]] category. A diagram is said to be '''small''' or '''finite''' whenever ''J'' is.

A morphism of diagrams of type ''J'' in a category ''C'' is a [[natural transformation]] between functors. One can then interpret the '''category of diagrams''' of type ''J'' in ''C'' as the [[functor category]] ''C''''J'', and a diagram is then an object in this category.

==Examples==
* Given any object ''A'' in ''C'', one has the '''constant diagram''', which is the diagram that maps all objects in ''J'' to ''A'', and all morphisms of ''J'' to the identity morphism on ''A''. Notationally, one often uses an underbar to denote the constant diagram: thus, for any object <math>A</math> in ''C'', one has the constant diagram <math>\underline A</math>.

* If ''J'' is a (small) [[discrete category]], then a diagram of type ''J'' is essentially just an [[indexed family]] of objects in ''C'' (indexed by ''J''). When used in the construction of the [[limit (category theory)|limit]], the result is the [[product (category theory)|product]]; for the colimit, one gets the [[coproduct]]. So, for example, when ''J'' is the discrete category with two objects, the resulting limit is just the binary product.

* If ''J'' = -1 ← 0 → +1, then a diagram of type ''J'' (''A'' ← ''B'' → ''C'') is a [[span (category theory)|span]], and its colimit is a [[Pushout (category theory)|pushout]]. If one were to "forget" that the diagram had object ''B'' and the two arrows ''B'' → ''A'', ''B'' → ''C'', the resulting diagram would simply be the discrete category with the two objects ''A'' and ''C'', and the colimit would simply be the binary coproduct. Thus, this example shows an important way in which the idea of the diagram generalizes that of the [[index set]] in set theory: by including the morphisms ''B'' → ''A'', ''B'' → ''C'', one discovers additional structure in constructions built from the diagram, structure that would not be evident if one only had an index set with no relations between the objects in the index.

* If ''J'' = -1 → 0 ← +1, then a diagram of type ''J'' (''A'' → ''B'' ← ''C'') is a [[cospan]], and its limit is a [[Pullback (category theory)|pullback]].

* The index <math>J = 0 \overrightarrow{\to} 1</math> is called "two parallel morphisms", or sometimes the [[free quiver]] or the [[walking quiver]]. A diagram of type ''J'' (<math>f,g\colon X \to Y</math>) is then a [[quiver (mathematics)|quiver]]; its limit is an [[Equaliser (mathematics)|equalizer]], and its colimit is a [[coequalizer]].

* If ''J'' is a [[poset category]], then a diagram of type ''J'' is a family of objects ''D''''i'' together with a unique morphism ''f''''ij'' : ''D''''i'' → ''D''''j'' whenever ''i'' ≤ ''j''. If ''J'' is [[directed set|directed]] then a diagram of type ''J'' is called a [[direct system (mathematics)|direct system]] of objects and morphisms. If the diagram is [[contravariant functor|contravariant]] then it is called an [[inverse system]].

==Cones and limits==

A [[cone (category theory)|cone]] with vertex ''N'' of a diagram ''D'' : ''J'' → ''C'' is a morphism from the constant diagram Δ(''N'') to ''D''. The constant diagram is the diagram which sends every object of ''J'' to an object ''N'' of ''C'' and every morphism to the identity morphism on ''N''.

The [[limit (category theory)|limit]] of a diagram ''D'' is a [[universal cone]] to ''D''. That is, a cone through which all other cones uniquely factor. If the limit exists in a category ''C'' for all diagrams of type ''J'' one obtains a functor
:lim : ''C''''J'' → ''C''
which sends each diagram to its limit.

Dually, the [[colimit]] of diagram ''D'' is a universal cone from ''D''. If the colimit exists for all diagrams of type ''J'' one has a functor
:colim : ''C''''J'' → ''C''
which sends each diagram to its colimit.

== Commutative diagrams ==
{{main|Commutative diagram}}

Diagrams and functor categories are often visualized by [[commutative diagrams]], particularly if the index category is a finite [[poset category]] with few elements: one draws a commutative diagram with a node for every object in the index category, and an arrow for a generating set of morphisms, omitting identity maps and morphisms that can be expressed as compositions. The commutativity corresponds to the uniqueness of a map between two objects in a poset category. Conversely, every commutative diagram represents a diagram (a functor from a poset index category) in this way.

Not every diagram commutes, as not every index category is a poset category:
most simply, the diagram of a single object with an endomorphism (<math>f\colon X \to X</math>), or with two parallel arrows (<math>\bullet \overrightarrow{\to} \bullet</math>; <math>f,g\colon X \to Y</math>) need not commute. Further, diagrams may be impossible to draw (because infinite) or simply messy (because too many objects or morphisms); however, schematic commutative diagrams (for subcategories of the index category, or with ellipses, such as for a directed system) are used to clarify such complex diagrams.

== See also ==
* [[Direct system (mathematics)|Direct system]]
* [[Inverse system]]

==References==
{{reflist}}
*{{cite book | last = Adámek | first = Jiří | coauthors = Horst Herrlich, and George E. Strecker | year = 1990 | url = http://katmat.math.uni-bremen.de/acc/acc.pdf | title = Abstract and Concrete Categories | publisher = John Wiley & Sons | isbn = 0-471-60922-6}} Now available as free on-line edition (4.2MB PDF).
* {{Cite book| last1=Barr| first1=Michael|authorlink1=Michael Barr (mathematician) | last2=Wells| first2=Charles| authorlink2=Charles Wells (mathematician) |year=2002| title=Toposes, Triples and Theories|url=http://www.tac.mta.ca/tac/reprints/articles/12/tr12.pdf|isbn=0-387-96115-1}} Revised and corrected free online version of ''Grundlehren der mathematischen Wissenschaften (278)'' Springer-Verlag, 1983).
* {{nlab|id=diagram}}

== External links ==
* [http://mathworld.wolfram.com/DiagramChasing.html Diagram Chasing] at [[MathWorld]]
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, Commutative diagrams, categories, [[functor]]s, [[natural transformation]]s.
[[Category:Functors]]

[[ko:그림 (범주론)]]
[[nl:Diagram (categorietheorie)]]
[[pl:Diagram (teoria kategorii)]]

Diagram (category theory)

2013-01-09T02:40:49Z

Magmalex: Added external links

In [[category theory]], a branch of mathematics, a '''diagram''' is the categorical analogue of an [[indexed family]] in [[set theory]]. The primary difference is that in the categorical setting one has [[morphism]]s that also need indexing. An indexed family of sets is a collection of sets, indexed by a fixed set; equivalently, a ''function'' from a fixed index ''set'' to the class of ''sets''. A diagram is a collection of objects and morphisms, indexed by a fixed category; equivalently, a ''functor'' from a fixed index ''category'' to some ''category''.

Diagrams are central to the definition of [[limit (category theory)|limits and colimits]], and to the related notion of [[cone (category theory)|cone]]s.

==Definition==

Formally, a '''diagram''' of type ''J'' in a [[category (mathematics)|category]] ''C'' is a ([[Covariance and contravariance of functors|covariant]]) [[functor]]
:''D'' : ''J'' → ''C''
The category ''J'' is called the '''index category''' or the '''scheme''' of the diagram ''D''; the functor is sometimes called a '''''J''-shaped diagram'''.<ref>J.P. May, ''A Concise Course in Algebraic Topology'', (1999) The University of Chicago Press, ISBN 0-226-51183-9</ref> The actual objects and morphisms in ''J'' are largely irrelevant, only the way in which they are interrelated matters. The diagram ''D'' is thought of as indexing a collection of objects and morphisms in ''C'' patterned on ''J''.

Although, technically, there is no difference between an individual ''diagram'' and a ''functor'' or between a ''scheme'' and a ''category'', the change in terminology reflects a change in perspective, just as in the set theoretic case: one fixes the index category, and allows the functor (and, secondarily, the target category) to vary.

One is most often interested in the case where the scheme ''J'' is a [[small category|small]] or even [[Finite set|finite]] category. A diagram is said to be '''small''' or '''finite''' whenever ''J'' is.

A morphism of diagrams of type ''J'' in a category ''C'' is a [[natural transformation]] between functors. One can then interpret the '''category of diagrams''' of type ''J'' in ''C'' as the [[functor category]] ''C''''J'', and a diagram is then an object in this category.

==Examples==
* Given any object ''A'' in ''C'', one has the '''constant diagram''', which is the diagram that maps all objects in ''J'' to ''A'', and all morphisms of ''J'' to the identity morphism on ''A''. Notationally, one often uses an underbar to denote the constant diagram: thus, for any object <math>A</math> in ''C'', one has the constant diagram <math>\underline A</math>.

* If ''J'' is a (small) [[discrete category]], then a diagram of type ''J'' is essentially just an [[indexed family]] of objects in ''C'' (indexed by ''J''). When used in the construction of the [[limit (category theory)|limit]], the result is the [[product (category theory)|product]]; for the colimit, one gets the [[coproduct]]. So, for example, when ''J'' is the discrete category with two objects, the resulting limit is just the binary product.

* If ''J'' = -1 ← 0 → +1, then a diagram of type ''J'' (''A'' ← ''B'' → ''C'') is a [[span (category theory)|span]], and its colimit is a [[Pushout (category theory)|pushout]]. If one were to "forget" that the diagram had object ''B'' and the two arrows ''B'' → ''A'', ''B'' → ''C'', the resulting diagram would simply be the discrete category with the two objects ''A'' and ''C'', and the colimit would simply be the binary coproduct. Thus, this example shows an important way in which the idea of the diagram generalizes that of the [[index set]] in set theory: by including the morphisms ''B'' → ''A'', ''B'' → ''C'', one discovers additional structure in constructions built from the diagram, structure that would not be evident if one only had an index set with no relations between the objects in the index.

* If ''J'' = -1 → 0 ← +1, then a diagram of type ''J'' (''A'' → ''B'' ← ''C'') is a [[cospan]], and its limit is a [[Pullback (category theory)|pullback]].

* The index <math>J = 0 \overrightarrow{\to} 1</math> is called "two parallel morphisms", or sometimes the [[free quiver]] or the [[walking quiver]]. A diagram of type ''J'' (<math>f,g\colon X \to Y</math>) is then a [[quiver (mathematics)|quiver]]; its limit is an [[Equaliser (mathematics)|equalizer]], and its colimit is a [[coequalizer]].

* If ''J'' is a [[poset category]], then a diagram of type ''J'' is a family of objects ''D''''i'' together with a unique morphism ''f''''ij'' : ''D''''i'' → ''D''''j'' whenever ''i'' ≤ ''j''. If ''J'' is [[directed set|directed]] then a diagram of type ''J'' is called a [[direct system (mathematics)|direct system]] of objects and morphisms. If the diagram is [[contravariant functor|contravariant]] then it is called an [[inverse system]].

==Cones and limits==

A [[cone (category theory)|cone]] with vertex ''N'' of a diagram ''D'' : ''J'' → ''C'' is a morphism from the constant diagram Δ(''N'') to ''D''. The constant diagram is the diagram which sends every object of ''J'' to an object ''N'' of ''C'' and every morphism to the identity morphism on ''N''.

The [[limit (category theory)|limit]] of a diagram ''D'' is a [[universal cone]] to ''D''. That is, a cone through which all other cones uniquely factor. If the limit exists in a category ''C'' for all diagrams of type ''J'' one obtains a functor
:lim : ''C''''J'' → ''C''
which sends each diagram to its limit.

Dually, the [[colimit]] of diagram ''D'' is a universal cone from ''D''. If the colimit exists for all diagrams of type ''J'' one has a functor
:colim : ''C''''J'' → ''C''
which sends each diagram to its colimit.

== Commutative diagrams ==
{{main|Commutative diagram}}

Diagrams and functor categories are often visualized by [[commutative diagrams]], particularly if the index category is a finite [[poset category]] with few elements: one draws a commutative diagram with a node for every object in the index category, and an arrow for a generating set of morphisms, omitting identity maps and morphisms that can be expressed as compositions. The commutativity corresponds to the uniqueness of a map between two objects in a poset category. Conversely, every commutative diagram represents a diagram (a functor from a poset index category) in this way.

Not every diagram commutes, as not every index category is a poset category:
most simply, the diagram of a single object with an endomorphism (<math>f\colon X \to X</math>), or with two parallel arrows (<math>\bullet \overrightarrow{\to} \bullet</math>; <math>f,g\colon X \to Y</math>) need not commute. Further, diagrams may be impossible to draw (because infinite) or simply messy (because too many objects or morphisms); however, schematic commutative diagrams (for subcategories of the index category, or with ellipses, such as for a directed system) are used to clarify such complex diagrams.

== See also ==
* [[Direct system (mathematics)|Direct system]]
* [[Inverse system]]

==References==
{{reflist}}
*{{cite book | last = Adámek | first = Jiří | coauthors = Horst Herrlich, and George E. Strecker | year = 1990 | url = http://katmat.math.uni-bremen.de/acc/acc.pdf | title = Abstract and Concrete Categories | publisher = John Wiley & Sons | isbn = 0-471-60922-6}} Now available as free on-line edition (4.2MB PDF).
* {{Cite book| last1=Barr| first1=Michael|authorlink1=Michael Barr (mathematician) | last2=Wells| first2=Charles| authorlink2=Charles Wells (mathematician) |year=2002| title=Toposes, Triples and Theories|url=http://www.tac.mta.ca/tac/reprints/articles/12/tr12.pdf|isbn=0-387-96115-1}} Revised and corrected free online version of ''Grundlehren der mathematischen Wissenschaften (278)'' Springer-Verlag, 1983).
* {{nlab|id=diagram}}

== External links ==
* [http://mathworld.wolfram.com/DiagramChasing.html Diagram Chasing] at [[MathWorld]]
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s.
[[Category:Functors]]

[[ko:그림 (범주론)]]
[[nl:Diagram (categorietheorie)]]
[[pl:Diagram (teoria kategorii)]]

Cone (category theory)

2012-12-27T14:07:15Z

Magmalex: /* External links */

In [[category theory]], a branch of [[mathematics]], the '''cone of a functor''' is an abstract notion used to define the [[limit (category theory)|limit]] of that functor. Cones make other appearances in category theory as well.

==Definition==

Let ''F'' : ''J'' → ''C'' be a [[diagram (category theory)|diagram]] in ''C''. Formally, a diagram is nothing more than a [[functor]] from ''J'' to ''C''. The change in terminology reflects the fact that we think of ''F'' as indexing a family of objects and morphisms in ''C''. The [[category (mathematics)|category]] ''J'' is thought of as an "index category". One should consider this in analogy with the concept of an [[indexed family]] of objects in set theory. The primary difference is that here we have [[morphism]]s as well.

Let ''N'' be an object of ''C''. A '''cone''' from ''N'' to ''F'' is a family of morphisms
:<math>\psi_X\colon N \to F(X)\,</math>
for each object ''X'' of ''J'' such that for every morphism ''f'' : ''X'' → ''Y'' in ''J'' the following diagram [[commutative diagram|commutes]]:

[[Image:Functor cone.svg|175px|center|Part of a cone from N to F]]

The (usually infinite) collection of all these triangles can
be (partially) depicted in the shape of a [[cone (geometry)|cone]] with the apex ''N''. The cone ψ is sometimes said to have '''vertex''' ''N'' and '''base''' ''F''.

One can also define the [[dual (category theory)|dual]] notion of a '''cone''' from ''F'' to ''N'' (also called a '''co-cone''') by reversing all the arrows above. Explicitly, a cone from ''F'' to ''N'' is a family of morphisms
:<math>\psi_X\colon F(X)\to N\,</math>
for each object ''X'' of ''J'' such that for every morphism ''f'' : ''X'' → ''Y'' in ''J'' the following diagram commutes:

[[Image:Functor co-cone.svg|175px|center|Part of a cone from F to N]]

==Equivalent formulations==

At first glance cones seem to be slightly abnormal constructions in category theory. They are maps from an ''object'' to a ''functor'' (or vice-versa). In keeping with the spirit of category theory we would like to define them as morphisms or objects in some suitable category. In fact, we can do both.

Let ''J'' be a small category and let ''C''''J'' be the [[category of diagrams]] of type ''J'' in ''C'' (this is nothing more than a [[functor category]]). Define the [[diagonal functor]] Δ : ''C'' → ''C''''J'' as follows: Δ(''N'') : ''J'' → ''C'' is the [[constant functor]] to ''N'' for all ''N'' in ''C''.

If ''F'' is a diagram of type ''J'' in ''C'', the following statements are equivalent:
* ψ is a cone from ''N'' to ''F''
* ψ is a [[natural transformation]] from Δ(''N'') to ''F''
* (''N'', ψ) is an object in the [[comma category]] (Δ ↓ ''F'')

The dual statements are also equivalent:
* ψ is a co-cone from ''F'' to ''N''
* ψ is a [[natural transformation]] from ''F'' to Δ(''N'')
* (''N'', ψ) is an object in the [[comma category]] (''F'' ↓ Δ)

These statements can all be verified by a straightforward application of the definitions. Thinking of cones as natural transformations we see that they are just morphisms in ''C''''J'' with source (or target) a constant functor.

== Category of cones ==

By the above, we can define the '''category of cones to ''F''''' as the comma category (Δ ↓ ''F''). Morphisms of cones are then just morphisms in this category. As one might expect a morphism from a cone (''N'', ψ) to a cone (''L'', φ) is just a morphism ''N'' → ''L'' such that all the "obvious" diagrams commute (see the first diagram in the next section).

Likewise, the '''category of co-cones from ''F''''' is the comma category (''F'' ↓ Δ).

== Universal cones ==

[[Limit (category theory)|Limits and colimits]] are defined as '''universal cones'''. That is, cones through which all other cones factor. A cone φ from ''L'' to ''F'' is a universal cone if for any other cone ψ from ''N'' to ''F'' there is a unique morphism from ψ to φ.

[[Image:Functor cone (extended).svg|250px|center]]

Equivalently, a universal cone to ''F'' is a [[universal morphism]] from Δ to ''F'' (thought of as an object in ''C''''J''), or a [[terminal object]] in (Δ ↓ ''F'').

Dually, a cone φ from ''F'' to ''L'' is a universal cone if for any other cone ψ from ''F'' to ''N'' there is a unique morphism from φ to ψ.

[[Image:Functor co-cone (extended).svg|250px|center]]

Equivalently, a universal cone from ''F'' is a universal morphism from ''F'' to Δ, or an [[initial object]] in (''F'' ↓ Δ).

The limit of ''F'' is a universal cone to ''F'', and the colimit is a universal cone from ''F''. As with all universal constructions, universal cones are not guaranteed to exist for all diagrams ''F'', but if they do exist they are unique up to a unique isomorphism.

== References ==

*{{cite book | first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | edition = 2nd ed. | publisher = Springer | location = New York | isbn = 0-387-98403-8}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]] and Cones.
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Category theory]]
[[Category:Limits (category theory)]]

[[nl:Kegel (categorietheorie)]]

Cone (category theory)

2012-12-27T14:04:12Z

Magmalex: /* External links */ Edited

In [[category theory]], a branch of [[mathematics]], the '''cone of a functor''' is an abstract notion used to define the [[limit (category theory)|limit]] of that functor. Cones make other appearances in category theory as well.

==Definition==

Let ''F'' : ''J'' → ''C'' be a [[diagram (category theory)|diagram]] in ''C''. Formally, a diagram is nothing more than a [[functor]] from ''J'' to ''C''. The change in terminology reflects the fact that we think of ''F'' as indexing a family of objects and morphisms in ''C''. The [[category (mathematics)|category]] ''J'' is thought of as an "index category". One should consider this in analogy with the concept of an [[indexed family]] of objects in set theory. The primary difference is that here we have [[morphism]]s as well.

Let ''N'' be an object of ''C''. A '''cone''' from ''N'' to ''F'' is a family of morphisms
:<math>\psi_X\colon N \to F(X)\,</math>
for each object ''X'' of ''J'' such that for every morphism ''f'' : ''X'' → ''Y'' in ''J'' the following diagram [[commutative diagram|commutes]]:

[[Image:Functor cone.svg|175px|center|Part of a cone from N to F]]

The (usually infinite) collection of all these triangles can
be (partially) depicted in the shape of a [[cone (geometry)|cone]] with the apex ''N''. The cone ψ is sometimes said to have '''vertex''' ''N'' and '''base''' ''F''.

One can also define the [[dual (category theory)|dual]] notion of a '''cone''' from ''F'' to ''N'' (also called a '''co-cone''') by reversing all the arrows above. Explicitly, a cone from ''F'' to ''N'' is a family of morphisms
:<math>\psi_X\colon F(X)\to N\,</math>
for each object ''X'' of ''J'' such that for every morphism ''f'' : ''X'' → ''Y'' in ''J'' the following diagram commutes:

[[Image:Functor co-cone.svg|175px|center|Part of a cone from F to N]]

==Equivalent formulations==

At first glance cones seem to be slightly abnormal constructions in category theory. They are maps from an ''object'' to a ''functor'' (or vice-versa). In keeping with the spirit of category theory we would like to define them as morphisms or objects in some suitable category. In fact, we can do both.

Let ''J'' be a small category and let ''C''''J'' be the [[category of diagrams]] of type ''J'' in ''C'' (this is nothing more than a [[functor category]]). Define the [[diagonal functor]] Δ : ''C'' → ''C''''J'' as follows: Δ(''N'') : ''J'' → ''C'' is the [[constant functor]] to ''N'' for all ''N'' in ''C''.

If ''F'' is a diagram of type ''J'' in ''C'', the following statements are equivalent:
* ψ is a cone from ''N'' to ''F''
* ψ is a [[natural transformation]] from Δ(''N'') to ''F''
* (''N'', ψ) is an object in the [[comma category]] (Δ ↓ ''F'')

The dual statements are also equivalent:
* ψ is a co-cone from ''F'' to ''N''
* ψ is a [[natural transformation]] from ''F'' to Δ(''N'')
* (''N'', ψ) is an object in the [[comma category]] (''F'' ↓ Δ)

These statements can all be verified by a straightforward application of the definitions. Thinking of cones as natural transformations we see that they are just morphisms in ''C''''J'' with source (or target) a constant functor.

== Category of cones ==

By the above, we can define the '''category of cones to ''F''''' as the comma category (Δ ↓ ''F''). Morphisms of cones are then just morphisms in this category. As one might expect a morphism from a cone (''N'', ψ) to a cone (''L'', φ) is just a morphism ''N'' → ''L'' such that all the "obvious" diagrams commute (see the first diagram in the next section).

Likewise, the '''category of co-cones from ''F''''' is the comma category (''F'' ↓ Δ).

== Universal cones ==

[[Limit (category theory)|Limits and colimits]] are defined as '''universal cones'''. That is, cones through which all other cones factor. A cone φ from ''L'' to ''F'' is a universal cone if for any other cone ψ from ''N'' to ''F'' there is a unique morphism from ψ to φ.

[[Image:Functor cone (extended).svg|250px|center]]

Equivalently, a universal cone to ''F'' is a [[universal morphism]] from Δ to ''F'' (thought of as an object in ''C''''J''), or a [[terminal object]] in (Δ ↓ ''F'').

Dually, a cone φ from ''F'' to ''L'' is a universal cone if for any other cone ψ from ''F'' to ''N'' there is a unique morphism from φ to ψ.

[[Image:Functor co-cone (extended).svg|250px|center]]

Equivalently, a universal cone from ''F'' is a universal morphism from ''F'' to Δ, or an [[initial object]] in (''F'' ↓ Δ).

The limit of ''F'' is a universal cone to ''F'', and the colimit is a universal cone from ''F''. As with all universal constructions, universal cones are not guaranteed to exist for all diagrams ''F'', but if they do exist they are unique up to a unique isomorphism.

== References ==

*{{cite book | first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | edition = 2nd ed. | publisher = Springer | location = New York | isbn = 0-387-98403-8}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Category theory]]
[[Category:Limits (category theory)]]

[[nl:Kegel (categorietheorie)]]

Cone (category theory)

2012-12-27T14:02:42Z

Magmalex: Added external links

In [[category theory]], a branch of [[mathematics]], the '''cone of a functor''' is an abstract notion used to define the [[limit (category theory)|limit]] of that functor. Cones make other appearances in category theory as well.

==Definition==

Let ''F'' : ''J'' → ''C'' be a [[diagram (category theory)|diagram]] in ''C''. Formally, a diagram is nothing more than a [[functor]] from ''J'' to ''C''. The change in terminology reflects the fact that we think of ''F'' as indexing a family of objects and morphisms in ''C''. The [[category (mathematics)|category]] ''J'' is thought of as an "index category". One should consider this in analogy with the concept of an [[indexed family]] of objects in set theory. The primary difference is that here we have [[morphism]]s as well.

Let ''N'' be an object of ''C''. A '''cone''' from ''N'' to ''F'' is a family of morphisms
:<math>\psi_X\colon N \to F(X)\,</math>
for each object ''X'' of ''J'' such that for every morphism ''f'' : ''X'' → ''Y'' in ''J'' the following diagram [[commutative diagram|commutes]]:

[[Image:Functor cone.svg|175px|center|Part of a cone from N to F]]

The (usually infinite) collection of all these triangles can
be (partially) depicted in the shape of a [[cone (geometry)|cone]] with the apex ''N''. The cone ψ is sometimes said to have '''vertex''' ''N'' and '''base''' ''F''.

One can also define the [[dual (category theory)|dual]] notion of a '''cone''' from ''F'' to ''N'' (also called a '''co-cone''') by reversing all the arrows above. Explicitly, a cone from ''F'' to ''N'' is a family of morphisms
:<math>\psi_X\colon F(X)\to N\,</math>
for each object ''X'' of ''J'' such that for every morphism ''f'' : ''X'' → ''Y'' in ''J'' the following diagram commutes:

[[Image:Functor co-cone.svg|175px|center|Part of a cone from F to N]]

==Equivalent formulations==

At first glance cones seem to be slightly abnormal constructions in category theory. They are maps from an ''object'' to a ''functor'' (or vice-versa). In keeping with the spirit of category theory we would like to define them as morphisms or objects in some suitable category. In fact, we can do both.

Let ''J'' be a small category and let ''C''''J'' be the [[category of diagrams]] of type ''J'' in ''C'' (this is nothing more than a [[functor category]]). Define the [[diagonal functor]] Δ : ''C'' → ''C''''J'' as follows: Δ(''N'') : ''J'' → ''C'' is the [[constant functor]] to ''N'' for all ''N'' in ''C''.

If ''F'' is a diagram of type ''J'' in ''C'', the following statements are equivalent:
* ψ is a cone from ''N'' to ''F''
* ψ is a [[natural transformation]] from Δ(''N'') to ''F''
* (''N'', ψ) is an object in the [[comma category]] (Δ ↓ ''F'')

The dual statements are also equivalent:
* ψ is a co-cone from ''F'' to ''N''
* ψ is a [[natural transformation]] from ''F'' to Δ(''N'')
* (''N'', ψ) is an object in the [[comma category]] (''F'' ↓ Δ)

These statements can all be verified by a straightforward application of the definitions. Thinking of cones as natural transformations we see that they are just morphisms in ''C''''J'' with source (or target) a constant functor.

== Category of cones ==

By the above, we can define the '''category of cones to ''F''''' as the comma category (Δ ↓ ''F''). Morphisms of cones are then just morphisms in this category. As one might expect a morphism from a cone (''N'', ψ) to a cone (''L'', φ) is just a morphism ''N'' → ''L'' such that all the "obvious" diagrams commute (see the first diagram in the next section).

Likewise, the '''category of co-cones from ''F''''' is the comma category (''F'' ↓ Δ).

== Universal cones ==

[[Limit (category theory)|Limits and colimits]] are defined as '''universal cones'''. That is, cones through which all other cones factor. A cone φ from ''L'' to ''F'' is a universal cone if for any other cone ψ from ''N'' to ''F'' there is a unique morphism from ψ to φ.

[[Image:Functor cone (extended).svg|250px|center]]

Equivalently, a universal cone to ''F'' is a [[universal morphism]] from Δ to ''F'' (thought of as an object in ''C''''J''), or a [[terminal object]] in (Δ ↓ ''F'').

Dually, a cone φ from ''F'' to ''L'' is a universal cone if for any other cone ψ from ''F'' to ''N'' there is a unique morphism from φ to ψ.

[[Image:Functor co-cone (extended).svg|250px|center]]

Equivalently, a universal cone from ''F'' is a universal morphism from ''F'' to Δ, or an [[initial object]] in (''F'' ↓ Δ).

The limit of ''F'' is a universal cone to ''F'', and the colimit is a universal cone from ''F''. As with all universal constructions, universal cones are not guaranteed to exist for all diagrams ''F'', but if they do exist they are unique up to a unique isomorphism.

== References ==

*{{cite book | first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | edition = 2nd ed. | publisher = Springer | location = New York | isbn = 0-387-98403-8}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [[André Joyal]], [http://ncatlab.org/nlab CatLab], a wiki project dedicated to the exposition of categorical mathematics
* {{cite web | first = Chris | last = Hillman | title = A Categorical Primer | id = {{citeseerx|10.1.1.24.3264}} | postscript = : }} formal introduction to category theory.
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
*{{sep entry|category-theory|Category Theory|Jean-Pierre Marquis}} with an extensive bibliography.
* [http://www.mta.ca/~cat-dist/ List of academic conferences on category theory]
* Baez, John, 1996,"[http://math.ucr.edu/home/baez/week73.html The Tale of ''n''-categories.]" An informal introduction to higher order categories.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Category theory]]
[[Category:Limits (category theory)]]

[[nl:Kegel (categorietheorie)]]

Law of excluded middle

2012-12-04T23:29:57Z

Magmalex: /* See also */ Consequentia mirabilis

{{distinguish|fallacy of the excluded middle}}

:''This article uses forms of [[Mathematical logic|logical]] notation. For a concise description of the symbols used in this notation, see [[List of logic symbols]].''

In [[logic]], the '''law of excluded middle''' (or the '''principle of excluded middle''') is the third of the [[three classic laws of thought]]. It states that for any [[proposition]], either that proposition is true, or its [[negation]] is.

The law is also known as the '''law''' (or '''principle''') '''of the excluded third''' (or of '''the excluded middle'''), or, in [[Latin]], '''''principium tertii exclusi'''''. Yet another Latin designation for this law is '''''tertium non datur''''': "no third (possibility) is given".

The earliest known formulation is Aristotle's [[principle of non-contradiction]], first proposed in ''[[On Interpretation]],''<ref>Geach p. 74</ref> where he says that of two [[contradictory]] propositions (i.e. where one proposition is the negation of the other) one must be true, and the other false.<ref>''On Interpretation'', c. 9</ref> He also states it as a principle in the ''[[Metaphysics (Aristotle)|Metaphysics]]'' book 3, saying that it is necessary in every case to affirm or deny,<ref>''Metaphyics 2, 996b 26–30</ref> and that it is impossible that there should be anything between the two parts of a contradiction.<ref>''Metaphyics 7, 1011b 26–27</ref> The principle was stated as a [[theorem]] of [[propositional calculus|propositional logic]] by [[Bertrand Russell|Russell]] and [[Alfred North Whitehead|Whitehead]] in ''[[Principia Mathematica]]'' as:

:: <math>\mathbf{*2\cdot11}. \ \ \vdash . \ p \ \mathbf{v} \thicksim p</math><ref>{{citation|author=[[Alfred North Whitehead]], [[Bertrand Russel]]|title=Principia Mathematica|publisher=[[Cambridge]]|year=1910|pages=105}}[http://name.umdl.umich.edu/aat3201.0001.001]</ref>

The principle should not be confused with the [[principle of bivalence]], which states that every proposition is either true or false, and has only a semantical formulation.

== Classic laws of thought ==
The principle of excluded middle, along with its complement, the [[law of contradiction]] (the second of the [[three classic laws of thought]]), are correlates of the [[law of identity]] (the first of these laws). Because the principle of identity intellectually partitions the Universe into exactly two parts: "self" and "other", it creates a [[dichotomy]] wherein the two parts are "mutually exclusive" and "jointly exhaustive". The principle of contradiction is merely an expression of the mutually exclusive aspect of that dichotomy, and the principle of excluded middle is an expression of its jointly exhaustive aspect.

== Analogous laws ==
Some systems of logic have different but analogous laws. For some finite ''n''-valued logics, there is an analogous law called the '''law of excluded ''n''+1th'''. If negation is [[cyclic negation|cyclic]] and "∨" is a "max operator", then the law can be expressed in the object language by (P ∨ ~P ∨ ~~P ∨ ... ∨ ~...~P), where "~...~" represents ''n''−1 negation signs and "∨ ... ∨" ''n''−1 disjunction signs. It is easy to check that the sentence must receive at least one of the ''n'' [[truth value]]s (and not a value that is not one of the ''n'').

Other systems reject the law entirely.

== Examples ==

For example, if ''P'' is the proposition:

:''Socrates is mortal.''

then the law of excluded middle holds that the [[logical disjunction]]:

:''Either Socrates is mortal, or it is not the case that Socrates is mortal.''

is true by virtue of its form alone. That is, the "middle" position, that Socrates is neither mortal nor not-mortal, is excluded by logic, and therefore either the first possibility (''Socrates is mortal'') or its negation (''it is not the case that Socrates is mortal'') must be true.

An example of an argument that depends on the law of excluded middle follows.<ref>This well-known example of a non-constructive proof depending on the law of excluded middle can be found in many places, for example: Megill, Norm. ''Metamath: A Computer Language for Pure Mathematics'', footnote on p. 17,[http://us.metamath.org/index.html#book] and Davis 2000:220, footnote 2.</ref> We seek to prove that there exist two [[irrational number]]s <math>a</math> and <math>b</math> such that

:<math>a^b</math> is rational.

It is known that <math>\sqrt{2}</math> is irrational (see [[Square_root_of_2#Proofs_of_irrationality|proof]]). Consider the number

:<math>\sqrt{2}^{\sqrt{2}}</math>

Clearly (excluded middle) this number is either rational or irrational. If it is rational, the proof is complete, and
:<math>a=\sqrt{2}</math> and <math>b=\sqrt{2}</math>

But if <math>\sqrt{2}^{\sqrt{2}}</math> is irrational, then let

:<math>a=\sqrt{2}^{\sqrt{2}}</math> and <math>b=\sqrt{2}</math>

Then

:<math>a^b = \left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \sqrt{2}^{\left(\sqrt{2}\cdot\sqrt{2}\right)} = \sqrt{2}^2 = 2,</math>

and 2 is certainly rational. This concludes the proof.

In the above argument, the assertion "this number is either rational or irrational" invokes the law of excluded middle. An [[intuitionist]], for example, would not accept this argument without further support for that statement. This might come in the form of a proof that the number in question is in fact irrational (or rational, as the case may be); or a finite algorithm that could determine whether the number is rational or not.

=== The Law in non-constructive proofs over the infinite ===

The above proof is an example of a ''[[non-constructive]]'' proof disallowed by intuitionists:
{{quote|The proof is nonconstructive because it doesn't give specific numbers a and b that satisfy the theorem but only two separate possibilities, one of which must work. (Actually <math>a=\sqrt{2}^{\sqrt{2}}</math> is irrational but there is no known easy proof of that fact.) (Davis 2000:220)}}

By ''non-constructive'' Davis means that "a proof that there actually are mathematic entities satisfying certain conditions would have to provide a method to exhibit explicitly the entities in question." (p. 85). Such proofs presume the existence of a totality that is complete, a notion disallowed by intuitionists when extended to the ''infinite''—for them the infinite can never be completed:

{{quote|In classical mathematics there occur ''non-constructive'' or ''indirect'' existence proofs, which intuitionists do not accept. For example, to prove ''there exists an n such that P''(''n''), the classical mathematician may deduce a contradiction from the assumption for all ''n'', not ''P''(''n''). Under both the classical and the intuitionistic logic, by reductio ad absurdum this gives ''not for all n, not P''(''n''). The classical logic allows this result to be transformed into ''there exists an n such that P''(''n''), but not in general the intuitionistic... the classical meaning, that somewhere in the completed infinite totality of the natural numbers there occurs an ''n'' such that ''P''(''n''), is not available to him, since he does not conceive the natural numbers as a completed totality.<ref>In a comparative analysis (pp. 43–59) of the three "-isms" (and their foremost spokesmen)—Logicism (Russell and Whitehead), Intuitionism (Brouwer) and Formalism (Hilbert)—Kleene turns his thorough eye toward intuitionism, its "founder" [[Brouwer]], and the intuitionists' complaints with respect to the law of excluded middle as applied to arguments over the "completed infinite".</ref> (Kleene 1952:49–50)}}

Indeed, [[David Hilbert|Hilbert]] and [[Luitzen Egbertus Jan Brouwer|Brouwer]] both give examples of the law of excluded middle extended to the infinite. Hilbert's example: "the assertion that either there are only finitely many prime numbers or there are infinitely many" (quoted in Davis 2000:97); and Brouwer's: "Every mathematical species is either finite or infinite." (Brouwer 1923 in van Heijenoort 1967:336).

In general, intuitionists allow the use of the law of excluded middle when it is confined to discourse over finite collections (sets), but not when it is used in discourse over infinite sets (e.g. the natural numbers). Thus intuitionists absolutely disallow the blanket assertion: "For all propositions ''P'' concerning infinite sets ''D'': ''P'' or ~''P''" (Kleene 1952:48).

:''For more about the conflict between the intuitionists (e.g. Brouwer) and the formalists (Hilbert) see [[Foundations of mathematics]] and [[Intuitionism]].''

Putative counterexamples to the law of excluded middle include the [[liar paradox]] or [[Quine's Paradox]]. Certain resolutions of these paradoxes, particularly [[Graham Priest]]'s [[dialetheism]] as formalised in LP, have the law of excluded middle as a theorem, but resolve out the Liar as both true and false. In this way, the law of excluded middle is true, but because truth itself, and therefore disjunction, is not exclusive, it says next to nothing if one of the disjuncts is paradoxical, or both true and false.

== History ==

=== Aristotle ===

[[Aristotle]] wrote that ambiguity can arise from the use of ambiguous names, but cannot exist in the facts themselves:

{{quote|It is impossible, then, that "being a man" should mean precisely "not being a man", if "man" not only signifies something about one subject but also has one significance. ... And it will not be possible to be and not to be the same thing, except in virtue of an ambiguity, just as if one whom we call "man", and others were to call "not-man"; but the point in question is not this, whether the same thing can at the same time be and not be a man in name, but whether it can be in fact. (''Metaphysics'' 4.4, W.D. Ross (trans.), GBWW 8, 525–526).}}

Aristotle's assertion that "...it will not be possible to be and not to be the same thing", which would be written in propositional logic as ¬ (''P'' ∧ ¬''P''), is a statement modern logicians could call the law of excluded middle (''P'' ∨ ¬''P''), as distribution of the negation of Aristotle's assertion makes them equivalent, regardless that the former claims that no statement is ''both'' true and false, while the latter requires that any statement is ''either'' true or false.

However, Aristotle also writes, "since it is impossible that contradictories should be at the same time true of the same thing, obviously contraries also cannot belong at the same time to the same thing" (Book IV, CH 6, p. 531). He then proposes that "there cannot be an intermediate between contradictories, but of one subject we must either affirm or deny any one predicate" (Book IV, CH 7, p. 531). In the context of Aristotle's [[traditional logic]], this is a remarkably precise statement of the law of excluded middle, ''P'' ∨ ¬''P''.

=== Leibniz ===

{{quote|Its usual form, "Every judgment is either true or false" [footnote 9]..."(from Kolmogorov in van Heijenoort, p. 421) footnote 9: "This is [[Gottfried Wilhelm Leibniz|Leibniz]]'s very simple formulation (see ''[[New Essays on Human Understanding|Nouveaux Essais]]'', IV,2)...." (ibid p 421)}}

=== Bertrand Russell and ''Principia Mathematica'' ===

[[Bertrand Russell]] asserts a distinction between the "law of excluded middle" and the "law of noncontradiction". In ''[[The Problems of Philosophy]]'', he cites three "Laws of Thought" as more or less "self-evident" or "a priori" in the sense of Aristotle:

::1. [[Law of identity]]: "Whatever is, is."
::2. [[Law of noncontradiction]]: "Nothing can both be and not be."
::3. '''Law of excluded middle''': "Everything must either be or not be."

::These three laws are samples of self-evident logical principles... (p. 72)

It is correct, at least for bivalent logic—i.e. it can be seen with a [[Karnaugh map]]—that Russell's Law (2) removes "the middle" of the [[Logical disjunction|inclusive-or]] used in his law (3). And this is the point of Reichenbach's demonstration that some believe the [[Exclusive or|''exclusive''-or]] should take the place of the [[Logical disjunction|''inclusive''-or]].

About this issue (in admittedly very technical terms) Reichenbach observes:

::The tertium non datur
::29. (''x'')[''f''(''x'') ∨ ~''f''(''x'')]
::is not exhaustive in its major terms and is therefore an inflated formula. This fact may perhaps explain why some people consider it unreasonable to write (29) with the inclusive-'or', and want to have it written with the sign of the ''exclusive''-'or'

::30. (''x'')[''f''(''x'') ⊕ ~''f''(''x'')], where the symbol "⊕" signifies [[exclusive-or]]<ref>The original symbol as used by Reichenbach is an upside down V, nowadays used for AND. The AND for Reichenbach is the same as that used in Principia Mathematica -- a "dot" cf p. 27 where he shows a truth table where he defines "a.b". Reichenbach defines the exclusive-or on p. 35 as "the negation of the equivalence". One sign used nowadays is a circle with a + in it, i.e. ⊕ (because in binary, a ⊕ b yields modulo-2 addition -- addition without carry). Other signs are ≢ (not identical to), or ≠ (not equal to).</ref>
::in which form it would be fully exhaustive and therefore nomological in the narrower sense. (Reichenbach, p. 376)

In line (30) the "(x)" means "for all" or "for every", a form used by Russell and Reichenbach; today the symbolism is usually <math>\forall</math> ''x''. Thus an example of the expression would look like this:

* (''pig''): (''Flies''(''pig'') ⊕ ~''Flies''(''pig''))
* (For all instances of "pig" seen and unseen): ("Pig does fly" or "Pig does not fly" but not both simultaneously)

==== A formal definition from ''Principia Mathematica'' ====

''[[Principia Mathematica]]'' (''PM'') defines the law of excluded middle formally:

{{quote|*2.1 : ~p ∨ p (''PM'' p. 101)
Example: Either it is true that "this is red", or it is true that "this is not red". Hence it is true that "this is red or this is not red". (See below for more about how this is derived from the primitive axioms).}}

So just what is "truth" and "falsehood"? At the opening ''PM'' quickly announces some definitions:

{{quote|''Truth-values''. The "truth-values" of a proposition is ''truth'' if it is true and ''falsehood'' if it is false* [*This phrase is due to Frege]...the truth-value of "p ∨ q" is truth if the truth-value of either p or q is truth, and is falsehood otherwise ... that of "~ p" is the opposite of that of p..." (p. 7-8)}}

This is not much help. But later, in a much deeper discussion, ("Definition and systematic ambiguity of Truth and Falsehood" Chapter II part III, p. 41 ff ) ''PM'' defines truth and falsehood in terms of a relationship between the "a" and the "b" and the "percipient". For example "This 'a' is 'b'" (e.g. "This 'object a' is 'red'") really means "'object a' is a sense-datum" and "'red' is a sense-datum", and they "stand in relation" to one another and in relation to "I". Thus what we really mean is: "I perceive that 'This object a is red'" and this is an undeniable-by-3rd-party "truth".

''PM'' further defines a distinction between a "sense-datum" and a "sensation":
{{quote|That is, when we judge (say) "this is red", what occurs is a relation of three terms, the mind, and "this", and "red". On the other hand, when we perceive "the redness of this", there is a relation of two terms, namely the mind and the complex object "the redness of this" (pp. 43–44).}}

Russell reiterated his distinction between "sense-datum" and "sensation" in his book ''The Problems of Philosophy'' (1912) published at the same time as ''PM'' (1910–1913):
{{quote|Let us give the name of "sense-data" to the things that are immediately known in sensation: such things as colours, sounds, smells, hardnesses, roughnesses, and so on. We shall give the name "sensation" to the experience of being immediately aware of these things... The colour itself is a sense-datum, not a sensation. (p. 12)}}

Russell further described his reasoning behind his definitions of "truth" and "falsehood" in the same book (Chapter XII ''Truth and Falsehood'').

==== Consequences of the law of excluded middle in ''Principia Mathematica'' ====

From the law of excluded middle, formula ✸2.1 in ''[[Principia Mathematica]],'' Whitehead and Russell derive some of the most powerful tools in the logician's argumentation toolkit. (In ''Principia Mathematica,'' formulas and propositions are identified by a leading asterisk and two numbers, such as "✸2.1".)

✸2.1 ~''p'' ∨ ''p'' "This is the Law of excluded middle" (''PM'', p. 101).

The proof of ✸2.1 is roughly as follows: "primitive idea" 1.08 defines ''p'' → ''q'' = ~''p'' ∨ ''q''. Substituting ''p'' for ''q'' in this rule yields ''p'' → ''p'' = ~''p'' ∨ ''p''. Since ''p'' → ''p'' is true (this is Theorem 2.08, which is proved separately), then ~''p'' ∨ ''p'' must be true.

✸2.11 ''p'' ∨ ~''p'' (Permutation of the assertions is allowed by axiom 1.4) 
✸2.12 ''p'' → ~(~''p'') (Principle of double negation, part 1: if "this rose is red" is true then it's not true that "'this rose is not-red' is true".) 
✸2.13 ''p'' ∨ ~{~(~''p'')} (Lemma together with 2.12 used to derive 2.14) 
✸2.14 ~(~''p'') → ''p'' (Principle of double negation, part 2) 
✸2.15 (~''p'' → ''q'') → (~''q'' → ''p'') (One of the four "Principles of transposition". Similar to 1.03, 1.16 and 1.17. A very long demonstration was required here.) 
✸2.16 (''p'' → ''q'') → (~''q'' → ~''p'') (If it's true that "If this rose is red then this pig flies" then it's true that "If this pig doesn't fly then this rose isn't red.") 
✸2.17 ( ~''p'' → ~''q'' ) → (''q'' → ''p'') (Another of the "Principles of transposition".) 
✸2.18 (~''p'' → ''p'') → ''p'' (Called "The complement of ''reductio ad absurdum''. It states that a proposition which [[Logical consequence|follows from]] the hypothesis of its own falsehood is true" (''PM'', pp. 103–104).)

Most of these theorems—in particular ✸2.1, ✸2.11, and ✸2.14—are rejected by intuitionism. These tools are recast into another form that Kolmogorov cites as "Hilbert's four axioms of implication" and "Hilbert's two axioms of negation" (Kolmogorov in van Heijenoort, p. 335).

Propositions ✸2.12 and ✸2.14, "double negation":
The [[Intuitionism|intuitionist]] writings of [[L. E. J. Brouwer]] refer to what he calls "the ''principle of the reciprocity of the multiple species'', that is, the principle that for every system the correctness of a property follows from the impossibility of the impossibility of this property" (Brouwer, ibid, p. 335).

This principle is commonly called "the principle of double negation" (''PM'', pp. 101–102). From the law of excluded middle (✸2.1 and ✸2.11), ''PM'' derives principle ✸2.12 immediately. We substitute ~''p'' for ''p'' in 2.11 to yield ~''p'' ∨ ~(~''p''), and by the definition of implication (i.e. 1.01 p → q = ~p ∨ q) then ~p ∨ ~(~p)= p → ~(~p). QED (The derivation of 2.14 is a bit more involved.)



== Criticisms ==
Many modern logic systems reject the law of excluded middle, replacing it with the concept of [[negation as failure]]. That is, there is a third possibility: the truth of a proposition is unknown. The principle of negation-as-failure is used as a foundation for [[autoepistemic logic]], and is widely used in [[logic programming]]. In these systems, the programmer is free to assert the law of excluded middle as a true fact; it is not built-in ''a priori'' into these systems.

Mathematicians such as [[Luitzen Egbertus Jan Brouwer|L. E. J. Brouwer]] and [[Arend Heyting]] contested the usefulness of the law of excluded middle in the context of the modern mathematics <ref>
[http://books.google.co.uk/books?id=uUC30fqhdlAC&pg=PA138&dq=the+principle+of+excluded+middle+criticism+of&hl=en&ei=tzXUTfy-I8rysgaU1vTiAg&sa=X&oi=book_result&ct=result&resnum=3&ved=0CDYQ6AEwAg#v=onepage&q=the%20principle%20of%20excluded%20middle%20criticism%20of&f=false "Proof and Knowledge in Mathematics" by Michael Detlefsen]</ref>

[[Stéphane Lupasco]] (1900-1988) has also substantiated the [[Law of included middle|logic of the included middle]], showing that it constitutes "a true logic, mathematically formalized, multivalent (with three values: A, non-A, and T) and non-contradictory".<ref>
[http://www.theatlas.org/index.php?option=com_phocadownload&view=category&id=21:engineering-science&download=181:methodology-of-transdisciplinarity-levels-of-reality-logic-of-the-included-middle-and-complexity&Itemid=157 Basarab Nicolescu, 2010, "Methodology of Transdisciplinarity - Levels of Reality, Logic of the Included Middle and Complexity" Transdisciplinary Journal of Engineering and Science, vol 2010, p.31]</ref> Quantum mechanics is said to be an exemplar of this logic, through the [[Quantum superposition|superposition]] of "yes" and "no" quantum states; the included middle is also mentioned as one of the three axioms of [[transdisciplinarity]], without which reality cannot be understood.<ref>[http://www.theatlas.org/index.php?option=com_phocadownload&view=category&id=21:engineering-science&download=181:methodology-of-transdisciplinarity-levels-of-reality-logic-of-the-included-middle-and-complexity&Itemid=157 Basarab Nicolescu, 2010, "Methodology of Transdisciplinarity - Levels of Reality, Logic of the Included Middle and Complexity" Transdisciplinary Journal of Engineering and Science, vol 2010, p.31]</ref>

== See also ==
* [[Law of bivalence]]
* [[Law of excluded fourth]]
* [[Laws of thought]]
* [[Liar's paradox]]
* [[Logical graph]]s: a graphical syntax for propositional logic
* [[Peirce's law]]: another way of turning intuition classical
* [[Ternary logic]]
* [[Intuitionistic logic]]
* [[Diaconescu's theorem]]
* [[Consequentia mirabilis]]

== Footnotes ==
{{reflist}}

== References ==
* [[Thomas Aquinas|Aquinas, Thomas]], "[[Summa Theologica]]", [[Fathers of the English Dominican Province]] (trans.), [[Daniel J. Sullivan]] (ed.), vols. 19–20 in [[Robert Maynard Hutchins]] (ed.), ''[[Great Books of the Western World]]'', Encyclopædia Britannica, Inc., Chicago, IL, 1952. Cited as GB 19–20.
* [[Aristotle]], "[[Metaphysics (Aristotle)|Metaphysics]]", [[W.D. Ross]] (trans.), vol. 8 in [[Robert Maynard Hutchins]] (ed.), ''[[Great Books of the Western World]]'', Encyclopædia Britannica, Inc., Chicago, IL, 1952. Cited as GB 8. 1st published, W.D. Ross (trans.), ''The Works of Aristotle'', Oxford University Press, Oxford, UK.
* [[Martin Davis]] 2000, ''Engines of Logic: Mathematicians and the Origin of the Computer", W. W. Norton & Company, NY, ISBN 0-393-32229-7 pbk.
* [[John Dawson Jr.|Dawson, J.]], ''Logical Dilemmas, The Life and Work of Kurt Gödel'', A.K. Peters, Wellesley, MA, 1997.
* [[Jean van Heijenoort|van Heijenoort, J.]], ''From Frege to Gödel, A Source Book in Mathematical Logic, 1879–1931'', Harvard University Press, Cambridge, MA, 1967. Reprinted with corrections, 1977.
* Luitzen Egbertus Jan [[Brouwer]], 1923, ''On the significance of the principle of excluded middle in mathematics, especially in function theory'' [reprinted with commentary, p. 334, van Heijenoort]
* Andrei Nikolaevich [[Kolmogorov]], 1925, ''On the principle of excluded middle'', [reprinted with commentary, p. 414, van Heijenoort]
* Luitzen Egbertus Jan [[Brouwer]], 1927, ''On the domains of definitions of functions'',[reprinted with commentary, p. 446, van Heijenoort] Although not directly germane, in his (1923) Brouwer uses certain words defined in this paper.
* Luitzen Egbertus Jan [[Brouwer]], 1927(2), ''Intuitionistic reflections on formalism'',[reprinted with commentary, p. 490, van Heijenoort]
* [[Stephen C. Kleene]] 1952 original printing, 1971 6th printing with corrections, 10th printing 1991, ''Introduction to Metamathematics'', North-Holland Publishing Company, Amsterdam NY, ISBN 0-7204-2103-9.
* [[William Kneale|Kneale, W.]] and [[Martha Kneale|Kneale, M.]], ''The Development of Logic'', Oxford University Press, Oxford, UK, 1962. Reprinted with corrections, 1975.
* [[Alfred North Whitehead]] and [[Bertrand Russell]], ''Principia Mathematica to *56'', Cambridge at the University Press 1962 (Second Edition of 1927, reprinted). Extremely difficult because of arcane symbolism, but a must-have for serious logicians.
* [[Bertrand Russell]], ''The Problems of Philosophy, With a New Introduction by John Perry'', Oxford University Press, New York, 1997 edition (first published 1912). Very easy to read: Russell was a wonderful writer.
* [[Bertrand Russell]], ''The Art of Philosophizing and Other Essays'', Littlefield, Adams & Co., Totowa, NJ, 1974 edition (first published 1968). Includes a wonderful essay on "The Art of drawing Inferences".
* [[Hans Reichenbach]], ''Elements of Symbolic Logic'', Dover, New York, 1947, 1975.
* [[Tom M. Mitchell|Tom Mitchell]], ''Machine Learning'', WCB McGraw-Hill, 1997.
* [[Constance Reid]], ''Hilbert'', Copernicus: Springer-Verlag New York, Inc. 1996, first published 1969. Contains a wealth of biographical information, much derived from interviews.
* [[Bart Kosko]], ''Fuzzy Thinking: The New Science of Fuzzy Logic'', Hyperion, New York, 1993. Fuzzy thinking at its finest. But a good introduction to the concepts.
* [[David Hume]], ''An Inquiry Concerning Human Understanding'', reprinted in Great Books of the Western World Encyclopædia Britannica, Volume 35, 1952, p. 449 ff. This work was published by Hume in 1758 as his rewrite of his "juvenile" ''Treatise of Human Nature: Being An attempt to introduce the experimental method of Reasoning into Moral Subjects Vol. I, Of The Understanding'' first published 1739, reprinted as: David Hume, ''A Treatise of Human Nature'', Penguin Classics, 1985. Also see: [[David Applebaum]], ''The Vision of Hume'', Vega, London, 2001: a reprint of a portion of ''An Inquiry'' starts on p. 94 ff

== External links ==
* [http://plato.stanford.edu/entries/contradiction/ "Contradiction" entry] in the [[Stanford Encyclopedia of Philosophy]]

[[Category:Classical logic]]
[[Category:Articles containing proofs]]
[[Category:Theorems in propositional logic]]

[[ca:Principi del tercer exclòs]]
[[cs:Zákon o vyloučení třetího]]
[[de:Satz vom ausgeschlossenen Dritten]]
[[et:Välistatud kolmanda reegel]]
[[el:Αρχή αποκλειόμενου μέσου]]
[[es:Principio del tercero excluido]]
[[eo:Leĝo de neekzisto de tria eblo]]
[[fa:اصل طرد ثالث]]
[[fr:Principe du tiers exclu]]
[[ko:배중률]]
[[is:Lögmálið um annað tveggja]]
[[it:Tertium non datur]]
[[he:כלל השלישי מן הנמנע]]
[[ky:Үчүнчүлүктү четке кагуу принциби]]
[[hu:Kizárt harmadik elve]]
[[nl:Wet van de uitgesloten derde]]
[[ja:排中律]]
[[no:Loven om den ekskluderte tredje]]
[[pms:Prinsipi dël ters barà fòra]]
[[pl:Prawo wyłączonego środka]]
[[pt:Lei do terceiro excluído]]
[[ru:Закон исключённого третьего]]
[[fi:Kolmannen poissuljetun laki]]
[[sv:Lagen om det uteslutna tredje]]
[[uk:Закон виключеного третього]]
[[zh:排中律]]

Principle of explosion

2012-12-04T23:20:23Z

Magmalex: /* See also */ Consequentia mirabilis

{{About|the ex falso quodlibet logical principle|the file tagging software|Ex Falso (software)}}

{{Confusing|date=December 2007}}
{{One source|date= December 2011}}

The '''principle of explosion''', ([[Latin]]: ''ex falso quodlibet'' or ''ex contradictione sequitur quodlibet'', "from a contradiction, anything follows") or the '''principle of Pseudo-Scotus''',{{Citation needed|date=September 2011}} is the law of [[classical logic]], [[intuitionistic logic|intuitionistic logic]] and similar logical systems, according to which any statement can be proven from a contradiction.<ref>Carnielli, W. and Marcos, J. (2001) [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.107.70 "Ex contradictione non sequitur quodlibet"] ''Proc. 2nd Conf. on Reasoning and Logic'' (Bucharest, July 2000)</ref> That is, once a contradiction has been asserted, any proposition (or its negation) can be inferred from it. In symbolic terms, the principle of explosion can be expressed in the following way (where "<math>\vdash</math>" symbolizes the relation of [[logical consequence]]):

: <math>\{ \phi , \lnot \phi \} \vdash \psi</math>
: ''or''
: <math>\bot \to P</math>.

This can be read as, "If one claims something is both true (<math>\phi\,</math>) and not true (<math>\lnot \phi</math>), one can logically derive ''any'' conclusion (<math>\psi</math>)."

==Arguments for explosion==

===An informal argument===

Consider two inconsistent statements, “All lemons are yellow” and "Not all lemons are yellow", and suppose for the sake of argument that both are simultaneously true. If that is the case, we can then prove anything, for instance that "Santa Claus exists", by using the following argument: We know "All lemons are yellow". And from this we can infer that “All lemons are yellow" or "Santa Claus exists” (or both) - we started with the assertion that "all lemons are yellow", so this expanded statement must be true. Since "all lemons are yellow or Santa Claus exists (or both)", yet we earlier asserted that "not all lemons are yellow", the only possibility remaining is that "Santa Claus exists".

In more formal terms, there are two basic kinds of argument for the principle of explosion, semantic and proof-theoretic.

===The semantic argument===

The first argument is ''semantic'' or ''[[model theory|model-theoretic]]'' in nature. A sentence <math>\psi</math> is a ''[[semantic consequence]]'' of a set of sentences <math>\Gamma</math> only if every model of <math>\Gamma</math> is a model of <math>\psi</math>. But there is no model of the contradictory set <math>\{\phi , \lnot \phi \}</math>. [[A fortiori]], there is no model of <math>\{\phi , \lnot \phi \}</math> that is not a model of <math>\psi</math>. Thus, vacuously, every model of <math>\{\phi , \lnot \phi \}</math> is a model of <math>\psi</math>. Thus <math>\psi</math> is a semantic consequence of <math>\{\phi , \lnot \phi \}</math>.

=== The proof-theoretic argument ===

The second type of argument is ''[[proof theory|proof-theoretic]]'' in nature. Consider the following derivations:

#<math>\phi \wedge \neg \phi\,</math>
#:assumption
#<math>\phi\,</math>
#:from (1) by [[conjunction elimination]]
#<math>\neg \phi\,</math>
#:from (1) by conjunction elimination
#<math>\phi \vee \psi\,</math>
#:from (2) by [[disjunction introduction]]
#<math>\psi\,</math>
#:from (3) and (4) by [[disjunctive syllogism]]
#<math>(\phi \wedge \neg \phi) \to \psi</math>
#:from (5) by [[conditional proof]] (discharging assumption 1)

This is just the symbolic version of the informal argument given above, with <math>\phi</math> standing for "all lemons are yellow" and <math>\psi</math> standing for "Santa Claus exists". From "all lemons are yellow and not all lemons are yellow" (1), we infer "all lemons are yellow" (2) and "not all lemons are yellow" (3); from "all lemons are yellow" (2), we infer "all lemons are yellow or Santa Claus exists" (4); and from "not all lemons are yellow" (3) and "all lemons are yellow or Santa Claus exists" (4), we infer "Santa Claus exists" (5). Hence, if all lemons are yellow and not all lemons are yellow, then Santa Claus exists.

Or:

#<math>\phi \wedge \neg \phi\,</math>
#:hypothesis
#<math>\phi\,</math>
#:from (1) by conjunction elimination
#<math>\neg \phi\,</math>
#:from (1) by conjunction elimination
#<math>\neg \psi\,</math>
#:hypothesis
#<math>\phi\,</math>
#:reiteration of (2)
#<math>\neg \psi \to \phi</math>
#:from (4) to (5) by [[deduction theorem]]
#<math>( \neg \phi \to \neg \neg \psi)</math>
#:from (6) by [[contraposition]]
#<math>\neg \neg \psi</math>
#:from (3) and (7) by [[modus ponens]]
#<math>\psi\,</math>
#:from (8) by [[double negation elimination]]
#<math>(\phi \wedge \neg \phi) \to \psi</math>
#:from (1) to (9) by deduction theorem

Or:

#<math>\phi \wedge \neg \phi\,</math>
#:assumption
#<math>\neg \psi\,</math>
#:assumption
#<math>\phi\,</math>
#:from (1) by conjunction elimination
#<math>\neg \phi\,</math>
#:from (1) by conjunction elimination
#<math>\neg \neg \psi\,</math>
#:from (3) and (4) by [[reductio ad absurdum]] (discharging assumption 2)
#<math>\psi\,</math>
#:from (5) by double negation elimination
#<math>(\phi \wedge \neg \phi) \to \psi</math>
#:from (6) by conditional proof (discharging assumption 1)

== Addressing the principle ==

[[Paraconsistent logic]]s have been developed that allow for sub-contrary forming operators. [[Formal semantics (logic)|Model-theoretic]] paraconsistent logicians often deny the assumption that there can be no model of <math>\{\phi , \lnot \phi \}</math> and devise semantical systems in which there are such models. Alternatively, they reject the idea that propositions can be classified as true or false. [[Proof-theoretic semantics|Proof-theoretic]] paraconsistent logics usually deny the validity of one of the steps necessary for deriving an explosion, typically including disjunctive syllogism, disjunction introduction, and [[reductio ad absurdum]].

==See also==
* [[Dialetheism]] – belief in the existence of true contradictions
* [[Law of excluded middle]] – every proposition is either true or not true
* [[Law of noncontradiction]] – no proposition can be both true and not true
* [[Paraconsistent logic]] – a [[modal logic]] used to address contradictions
* [[Paradox of entailment]] – a seeming paradox derived from the principle of explosion
* [[Reductio ad absurdum]] – concluding that a proposition is false because it produces a contradiction
* [[Trivialism]] – the belief that all statements of the form "P and not-P" are true
* [[Consequentia mirabilis]] - Clavius's Law

==References==
{{reflist}}

[[Category:Theorems in propositional logic]]
[[Category:Classical logic]]
[[Category:Principles]]

[[de:Ex falso quodlibet]]
[[es:Principio de explosión]]
[[it:Ex falso sequitur quodlibet]]
[[hu:Hamisból minden következik]]
[[nl:Ex falso sequitur quod libet]]
[[pt:Princípio de explosão]]
[[uk:Принцип вибуху]]
[[zh:爆炸原理]]

Principle of explosion

2012-12-04T23:17:16Z

Magmalex: /* See also */ added Clavius's Law

{{About|the ex falso quodlibet logical principle|the file tagging software|Ex Falso (software)}}

{{Confusing|date=December 2007}}
{{One source|date= December 2011}}

The '''principle of explosion''', ([[Latin]]: ''ex falso quodlibet'' or ''ex contradictione sequitur quodlibet'', "from a contradiction, anything follows") or the '''principle of Pseudo-Scotus''',{{Citation needed|date=September 2011}} is the law of [[classical logic]], [[intuitionistic logic|intuitionistic logic]] and similar logical systems, according to which any statement can be proven from a contradiction.<ref>Carnielli, W. and Marcos, J. (2001) [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.107.70 "Ex contradictione non sequitur quodlibet"] ''Proc. 2nd Conf. on Reasoning and Logic'' (Bucharest, July 2000)</ref> That is, once a contradiction has been asserted, any proposition (or its negation) can be inferred from it. In symbolic terms, the principle of explosion can be expressed in the following way (where "<math>\vdash</math>" symbolizes the relation of [[logical consequence]]):

: <math>\{ \phi , \lnot \phi \} \vdash \psi</math>
: ''or''
: <math>\bot \to P</math>.

This can be read as, "If one claims something is both true (<math>\phi\,</math>) and not true (<math>\lnot \phi</math>), one can logically derive ''any'' conclusion (<math>\psi</math>)."

==Arguments for explosion==

===An informal argument===

Consider two inconsistent statements, “All lemons are yellow” and "Not all lemons are yellow", and suppose for the sake of argument that both are simultaneously true. If that is the case, we can then prove anything, for instance that "Santa Claus exists", by using the following argument: We know "All lemons are yellow". And from this we can infer that “All lemons are yellow" or "Santa Claus exists” (or both) - we started with the assertion that "all lemons are yellow", so this expanded statement must be true. Since "all lemons are yellow or Santa Claus exists (or both)", yet we earlier asserted that "not all lemons are yellow", the only possibility remaining is that "Santa Claus exists".

In more formal terms, there are two basic kinds of argument for the principle of explosion, semantic and proof-theoretic.

===The semantic argument===

The first argument is ''semantic'' or ''[[model theory|model-theoretic]]'' in nature. A sentence <math>\psi</math> is a ''[[semantic consequence]]'' of a set of sentences <math>\Gamma</math> only if every model of <math>\Gamma</math> is a model of <math>\psi</math>. But there is no model of the contradictory set <math>\{\phi , \lnot \phi \}</math>. [[A fortiori]], there is no model of <math>\{\phi , \lnot \phi \}</math> that is not a model of <math>\psi</math>. Thus, vacuously, every model of <math>\{\phi , \lnot \phi \}</math> is a model of <math>\psi</math>. Thus <math>\psi</math> is a semantic consequence of <math>\{\phi , \lnot \phi \}</math>.

=== The proof-theoretic argument ===

The second type of argument is ''[[proof theory|proof-theoretic]]'' in nature. Consider the following derivations:

#<math>\phi \wedge \neg \phi\,</math>
#:assumption
#<math>\phi\,</math>
#:from (1) by [[conjunction elimination]]
#<math>\neg \phi\,</math>
#:from (1) by conjunction elimination
#<math>\phi \vee \psi\,</math>
#:from (2) by [[disjunction introduction]]
#<math>\psi\,</math>
#:from (3) and (4) by [[disjunctive syllogism]]
#<math>(\phi \wedge \neg \phi) \to \psi</math>
#:from (5) by [[conditional proof]] (discharging assumption 1)

This is just the symbolic version of the informal argument given above, with <math>\phi</math> standing for "all lemons are yellow" and <math>\psi</math> standing for "Santa Claus exists". From "all lemons are yellow and not all lemons are yellow" (1), we infer "all lemons are yellow" (2) and "not all lemons are yellow" (3); from "all lemons are yellow" (2), we infer "all lemons are yellow or Santa Claus exists" (4); and from "not all lemons are yellow" (3) and "all lemons are yellow or Santa Claus exists" (4), we infer "Santa Claus exists" (5). Hence, if all lemons are yellow and not all lemons are yellow, then Santa Claus exists.

Or:

#<math>\phi \wedge \neg \phi\,</math>
#:hypothesis
#<math>\phi\,</math>
#:from (1) by conjunction elimination
#<math>\neg \phi\,</math>
#:from (1) by conjunction elimination
#<math>\neg \psi\,</math>
#:hypothesis
#<math>\phi\,</math>
#:reiteration of (2)
#<math>\neg \psi \to \phi</math>
#:from (4) to (5) by [[deduction theorem]]
#<math>( \neg \phi \to \neg \neg \psi)</math>
#:from (6) by [[contraposition]]
#<math>\neg \neg \psi</math>
#:from (3) and (7) by [[modus ponens]]
#<math>\psi\,</math>
#:from (8) by [[double negation elimination]]
#<math>(\phi \wedge \neg \phi) \to \psi</math>
#:from (1) to (9) by deduction theorem

Or:

#<math>\phi \wedge \neg \phi\,</math>
#:assumption
#<math>\neg \psi\,</math>
#:assumption
#<math>\phi\,</math>
#:from (1) by conjunction elimination
#<math>\neg \phi\,</math>
#:from (1) by conjunction elimination
#<math>\neg \neg \psi\,</math>
#:from (3) and (4) by [[reductio ad absurdum]] (discharging assumption 2)
#<math>\psi\,</math>
#:from (5) by double negation elimination
#<math>(\phi \wedge \neg \phi) \to \psi</math>
#:from (6) by conditional proof (discharging assumption 1)

== Addressing the principle ==

[[Paraconsistent logic]]s have been developed that allow for sub-contrary forming operators. [[Formal semantics (logic)|Model-theoretic]] paraconsistent logicians often deny the assumption that there can be no model of <math>\{\phi , \lnot \phi \}</math> and devise semantical systems in which there are such models. Alternatively, they reject the idea that propositions can be classified as true or false. [[Proof-theoretic semantics|Proof-theoretic]] paraconsistent logics usually deny the validity of one of the steps necessary for deriving an explosion, typically including disjunctive syllogism, disjunction introduction, and [[reductio ad absurdum]].

==See also==
* [[Dialetheism]] – belief in the existence of true contradictions
* [[Law of excluded middle]] – every proposition is either true or not true
* [[Law of noncontradiction]] – no proposition can be both true and not true
* [[Paraconsistent logic]] – a [[modal logic]] used to address contradictions
* [[Paradox of entailment]] – a seeming paradox derived from the principle of explosion
* [[Reductio ad absurdum]] – concluding that a proposition is false because it produces a contradiction
* [[Trivialism]] – the belief that all statements of the form "P and not-P" are true
* [[Clavius's Law]]

==References==
{{reflist}}

[[Category:Theorems in propositional logic]]
[[Category:Classical logic]]
[[Category:Principles]]

[[de:Ex falso quodlibet]]
[[es:Principio de explosión]]
[[it:Ex falso sequitur quodlibet]]
[[hu:Hamisból minden következik]]
[[nl:Ex falso sequitur quod libet]]
[[pt:Princípio de explosão]]
[[uk:Принцип вибуху]]
[[zh:爆炸原理]]

Consequentia mirabilis

2012-12-04T23:14:01Z

Magmalex: deleted orphan template

{{Italic title}}
{{Refimprove|date=November 2006}}

'''''Consequentia mirabilis''''' ([[Latin]] for "admirable consequence"), also known as '''Clavius's Law''', is used in [[traditional logic|traditional]] and [[classical logic]] to establish the truth of a proposition from the [[Consistency proof|inconsistency]] of its negation.<ref>Sainsbury, Richard. ''Paradoxes''. Cambridge University Press, 2009, p. 128.</ref> It is thus similar to ''[[reductio ad absurdum]]'', but it can prove a proposition true using just its negation. It states that if a proposition is a consequence of its negation, then it is true, for consistency. It can thus be demonstrated without using any other principle, but that of consistency.

In formal notation:
<math> (\neg A \rightarrow A) \rightarrow A </math>

For example: "There is no truth" (not-A), but this statement implies that it is a truth (A), therefore "there is some truth" (then A is true). Or even: "Nothing exists" implies that there is this statement, so "something there."

The most famous example is perhaps the Cartesian ''cogito ergo sum'': Even if one can question the validity of the thinking, no one can deny the existence of thought.

Children have an uncanny appreciation for this law, and use it unwittingly on a regular basis: "I'm not talking to you" proves that, in saying this, the person is in fact talking to the other.

It is also known as Clavius' law, after the learned sixteenth century [[Jesuit]] [[Christopher Clavius]], one of the designers of the [[Gregorian calendar]], who first drew attention to the law in his commentary on [[Euclid]].

==See also==
*''[[Ex falso quodlibet]]''
*''[[Tertium non datur]]''

==References==
{{Reflist}}

{{DEFAULTSORT:Consequentia Mirabilis}}
[[Category:Theorems in propositional logic]]
[[Category:Latin logical phrases]]

[[de:Consequentia mirabilis]]
[[fr:Consequentia mirabilis]]
[[it:Consequentia mirabilis]]
[[sk:Claviov zákon]]

Comma category

2012-12-03T09:21:56Z

Magmalex: /*External links*/ Added

In mathematics, a '''comma category''' (a special case being a '''slice category''') is a construction in [[category theory]]. It provides another way of looking at [[morphism]]s: instead of simply relating objects of a [[Category (mathematics)|category]] to one another, morphisms become objects in their own right. This notion was introduced in 1963 by [[William Lawvere|F. W. Lawvere]], although the technique did not become generally known until many years later. Today, it has become particularly important to mathematicians, because several important mathematical concepts can be treated as comma categories. There are also certain guarantees about the existence of [[Limit (category theory)|limit]]s and [[colimit]]s in the context of comma categories. The name comes from the notation originally used by Lawvere, which involved the [[comma]] punctuation mark. Although standard notation has changed since the use of a comma as an operator is potentially confusing, and even Lawvere dislikes the uninformative term "comma category", the name persists.

==Definition==
The most general comma category construction involves two [[functor]]s with the same codomain. Often one of these will have domain '''1''' (the one-object one-morphism category). Some accounts of category theory consider these special cases only, but the term comma category is actually much more general.

===General form===
Suppose that <math>\mathcal{A}</math>, <math>\mathcal{B}</math>, and <math>\mathcal{C}</math> are categories, and <math>S</math> and <math>T</math> (for source and target) are [[functor]]s
:<math>\mathcal A \xrightarrow{\;\; S\;\;} \mathcal C\xleftarrow{\;\; T\;\;} \mathcal B</math>
We can form the comma category <math>(S \downarrow T)</math> as follows:
*The objects are all triples <math>(\alpha, \beta, f)</math> with <math>\alpha</math> an object in <math>\mathcal{A}</math>, <math>\beta</math> an object in <math>\mathcal{B}</math>, and <math>f : S(\alpha)\rightarrow T(\beta)</math> a morphism in <math>\mathcal{C}</math>.
*The morphisms from <math>(\alpha, \beta, f)</math> to <math>(\alpha', \beta', f')</math> are all pairs <math>(g, h)</math> where <math>g : \alpha \rightarrow \alpha'</math> and <math>h : \beta \rightarrow \beta'</math> are morphisms in <math>\mathcal A</math> and <math>\mathcal B</math> respectively, such that the following diagram [[commutative diagram|commutes]]:

<math>\begin{matrix} S(\alpha) & \xrightarrow{S(g)} & S(\alpha')\\ f \Bigg\downarrow & & \Bigg\downarrow f'\\ T(\beta) & \xrightarrow[T(h)]{} & T(\beta') \end{matrix}</math>

Morphisms are composed by taking <math>(g, h) \circ (g', h')</math> to be <math>(g \circ g', h \circ h')</math>, whenever the latter expression is defined. The identity morphism on an object <math>(\alpha, \beta, f)</math> is <math>(\mathrm{id}_{\alpha}, \mathrm{id}_{\beta})</math>.

===Slice category===
The first special case occurs when <math>\mathcal A = \mathcal{C}</math>, <math>S</math> is the [[identity functor]], and <math>\mathcal{B}=\textbf{1}</math> (the category with one object <math>*</math> and one morphism). Then <math>T(*) = A</math> for some object <math>A</math> in <math>\mathcal{C}</math>. In this case, the comma category is written <math>(\mathcal{C} \downarrow A)</math>, and is often called the ''slice category'' over <math>A</math> or the category of ''objects over <math>A</math>''. The objects <math>(\alpha, *, f)</math> can be simplified to pairs <math>(\alpha, f)</math>, where <math>f : \alpha \rightarrow A</math>. Sometimes, <math>f</math> is denoted <math>\pi_\alpha</math>. A morphism from <math>(B, \pi_B)</math> to <math>(B', \pi_{B'})</math> in the slice category is then an arrow <math>g : B \rightarrow B'</math> making the following diagram commute:

<div style="text-align: center;">[[Image:CommaCategory-01.png]]</div>

===Coslice category===
The [[Dual (category theory)|dual]] concept to a slice category is a coslice category. Here, <math>S</math> has domain '''1''' and <math>T</math> is an identity functor. In this case, the comma category is often written
<math>(A\downarrow \mathcal{C})</math>, where <math>A</math> is the object of <math>\mathcal{C}</math> selected by <math>S</math>. It is called the ''coslice category'' with respect to <math>A</math>, or the category of ''objects under <math>A</math>''. The objects are pairs <math>(B, i_B)</math> with <math>i_B : A \rightarrow B</math>. Given <math>(B, i_B)</math> and <math>(B', i_{B'})</math>, a morphism in the coslice category is a map <math>h : B \rightarrow B'</math> making the following diagram commute:

<div style="text-align: center;">[[Image:CommaCategory-02.png]]</div>

===Arrow category===
<math>S</math> and <math>T</math> are [[identity functor]]s on <math>\mathcal{C}</math> (so <math>\mathcal{A} = \mathcal{B} = \mathcal{C}</math>). In this case, the comma category is the [[arrow category]] <math>\mathcal{C}^\rightarrow</math>. Its objects are the morphisms of <math>\mathcal{C}</math>, and its morphisms are commuting squares in <math>\mathcal{C}</math>.<ref name="joy">{{cite book | last = Adámek | first = Jiří | coauthors = Horst Herrlich, and George E. Strecker | year = 1990 | url = http://katmat.math.uni-bremen.de/acc/acc.pdf | title = Abstract and Concrete Categories | publisher = John Wiley & Sons | isbn = 0-471-60922-6}}</ref>

===Other variations===
In the case of the slice or coslice category, the identity functor may be replaced with some other functor; this yields a family of categories particularly useful in the study of [[adjoint functor]]s. For example, if <math>T</math> is the [[forgetful functor]] mapping an [[abelian group]] to its underlying [[Set (mathematics)|set]], and <math>s</math> is some fixed set (regarded as a functor from '''1'''), then the comma category <math>(s \downarrow T)</math> has objects that are maps from <math>s</math> to a set underlying a group. This relates to the left adjoint of <math>T</math>, which is the functor that maps a set to the [[free abelian group]] having that set as its basis. In particular, the initial object of <math>(s \downarrow T)</math> is the canonical injection <math>s\rightarrow T(G)</math>, where <math>G</math> is the free group generated by <math>s</math>.

An object of <math>(s \downarrow T)</math> is called a ''morphism from <math>s</math> to <math>T</math>'' or a ''<math>T</math>-structured arrow with domain <math>s</math>'' in.<ref name="joy" /> An object of <math>(S \downarrow t)</math> is called a ''morphism from <math>S</math> to <math>t</math>'' or a ''<math>S</math>-costructured arrow with codomain <math>t</math>'' in.<ref name="joy" />

Another special case occurs when both <math>S</math> and <math>T</math> are functors with domain '''1'''. If <math>S(*)=A</math> and <math>T(*)=B</math>, then the comma category <math>(S \downarrow T)</math>, written <math>(A\downarrow B)</math>, is the [[discrete category]] whose objects are morphisms from <math>A</math> to <math>B</math>.

==Properties==
For each comma category there are forgetful functors from it.
* Domain functor, <math>S\downarrow T \to \mathcal A</math>, which maps:
** objects: <math>(\alpha, \beta, f)\mapsto \alpha</math>;
** morphisms: <math>(g, h)\mapsto g</math>;
* Codomain functor, <math>S\downarrow T \to \mathcal B</math>, which maps:
** objects: <math>(\alpha, \beta, f)\mapsto \beta</math>;
** morphisms: <math>(g, h)\mapsto h</math>.{{Citation needed|date=July 2011}}

==Examples of use==
===Some notable categories===
Several interesting categories have a natural definition in terms of comma categories.
* The category of [[pointed set]]s is a comma category, <math>\scriptstyle {(\bull \downarrow \mathbf{Set})}</math> with <math>\scriptstyle {\bull}</math> being (a functor selecting) any [[singleton set]], and <math>\scriptstyle {\mathbf{Set}}</math> (the identity functor of) the [[category of sets]]. Each object of this category is a set, together with a function selecting some element of the set: the "basepoint". Morphisms are functions on sets which map basepoints to basepoints. In a similar fashion one can form the category of [[pointed space]]s <math>\scriptstyle {(\bull \downarrow \mathbf{Top})}</math>.

* The category of [[Graph (mathematics)|graphs]] is <math>\scriptstyle {(\mathbf{Set} \downarrow D)}</math>, with <math>\scriptstyle {D : \mathbf{Set} \rightarrow \mathbf{Set}}</math> the functor taking a set <math>s</math> to <math>t \times t</math>. The objects <math>(a, b, f)</math> then consist of two sets and a function; <math>a</math> is an indexing set, <math>b</math> is a set of nodes, and <math>f : a \rightarrow (b \times b)</math> chooses pairs of elements of <math>b</math> for each input from <math>a</math>. That is, <math>f</math> picks out certain edges from the set <math>b \times b</math> of possible edges. A morphism in this category is made up of two functions, one on the indexing set and one on the node set. They must "agree" according to the general definition above, meaning that <math>(g, h) : (a, b, f) \rightarrow (a', b', f')</math> must satisfy <math>f' \circ g = T(h) \circ f</math>. In other words, the edge corresponding to a certain element of the indexing set, when translated, must be the same as the edge for the translated index.

* Many "augmentation" or "labelling" operations can be expressed in terms of comma categories. Let <math>S</math> be the functor taking each graph to the set of its edges, and let <math>A</math> be (a functor selecting) some particular set: then <math>(S \downarrow A)</math> is the category of graphs whose edges are labelled by elements of <math>A</math>. This form of comma category is often called ''objects <math>S</math>-over <math>A</math>'' - closely related to the "objects over <math>A</math>" discussed above. Here, each object takes the form <math>(B, \pi_B)</math>, where <math>B</math> is a graph and <math>\pi_B</math> a function from the edges of <math>B</math> to <math>A</math>. The nodes of the graph could be labelled in essentially the same way.

* A category is said to be ''locally cartesian closed'' if every slice of it is [[cartesian closed]] (see above for the notion of ''slice''). Locally cartesian closed categories are the [[classifying category|classifying categories]] of [[dependent type theory|dependent type theories]].

===Limits and universal morphisms===
[[Limit (category theory)|Colimits]] in comma categories may be "inherited". If <math>\mathcal{A}</math> and <math>\mathcal{B}</math> are cocomplete, <math>S : \mathcal{A} \rightarrow \mathcal{C}</math> is a cocontinuous functor, and <math>T : \mathcal{B} \rightarrow \mathcal{C}</math> another functor (not necessarily cocontinuous), then the comma category <math>(S \downarrow T)</math> produced will also be cocomplete{{Cn|date=February 2012}}. For example, in the above construction of the category of graphs, the category of sets is cocomplete, and the identity functor is cocontinuous: so graphs are also cocomplete - all (small) colimits exist. This result is much harder to obtain directly.

If <math>\mathcal{A}</math> and <math>\mathcal{B}</math> are complete, and both <math>S : \mathcal{A} \rightarrow \mathcal{C}</math> and <math>T : \mathcal{B} \rightarrow \mathcal{C}</math> are [[continuous functor]]s,<ref>See I. 2.16.1 in Francis Borceux (1994), ''Handbook of Categorical Algebra 1'', Cambridge University Press. ISBN 0-521-44178-1.</ref> then the comma category <math>(S \downarrow T)</math> is also complete, and the projection functors <math>(S\downarrow T) \rightarrow \mathcal{A}</math> and <math>(S\downarrow T) \rightarrow \mathcal{B}</math> are limit preserving.

The notion of a [[Universal property|universal morphism]] to a particular colimit, or from a limit, can be expressed in terms of a comma category. Essentially, we create a category whose objects are cones, and where the limiting cone is a terminal object; then, each universal morphism for the limit is just the morphism to the terminal object. This works in the dual case, with a category of cocones having an initial object. For example, let <math>\mathcal{C}</math> be a category with <math>F : \mathcal{C} \rightarrow \mathcal{C} \times \mathcal{C}</math> the functor taking each object <math>c</math> to <math>(c, c)</math> and each arrow <math>f</math> to <math>(f, f)</math>. A universal morphism from <math>(a, b)</math> to <math>F</math> consists, by definition, of an object <math>(c, c)</math> and morphism <math>\rho : (a, b) \rightarrow (c, c)</math> with the universal property that for any morphism <math>\rho' : (a, b) \rightarrow (d, d)</math> there is a unique morphism <math>\sigma : c \rightarrow d</math> with <math>F(\sigma) \circ \rho = \rho'</math>. In other words, it is an object in the comma category <math>((a, b) \downarrow F)</math> having a morphism to any other object in that category; it is initial. This serves to define the [[coproduct]] in <math>\mathcal{C}</math>, when it exists.

===Adjunctions===
Lawvere showed that the functors <math>F : \mathcal{C} \rightarrow \mathcal{D}</math> and <math>G : \mathcal{D} \rightarrow \mathcal{C}</math> are [[adjoint functors|adjoint]] if and only if the comma categories <math>(F \downarrow id_\mathcal{D})</math> and <math>(id_\mathcal{C} \downarrow G)</math>, with <math>id_\mathcal{D}</math> and <math>id_\mathcal{C}</math> the identity functors on <math>\mathcal{C}</math> and <math>\mathcal{D}</math> respectively, are isomorphic, and equivalent elements in the comma category can be projected onto the same element of <math>\mathcal{C} \times \mathcal{D}</math>. This allows adjunctions to be described without involving sets, and was in fact the original motivation for introducing comma categories.

===Natural transformations===
If the domains of <math>S, T</math> are equal, then the diagram which defines morphisms in <math>S\downarrow T</math> with <math>\alpha=\beta, \alpha'=\beta', g=h</math> is identical to the diagram which defines a [[natural transformation]] <math>S\to T</math>. The difference between the two notions is that a natural transformation is a particular collection of morphisms of type of the form <math>S(\alpha)\to T(\alpha)</math>, while objects of the comma category contains ''all'' morphisms of type of such form. A functor to the comma category selects that particular collection of morphisms. This is described succinctly by an observation by Huq{{Citation needed|date=July 2011}} that a natural transformation <math>\eta:S\to T</math>, with <math>S, T:\mathcal A \to \mathcal C</math>, corresponds to a functor <math>\mathcal A \to (S\downarrow T)</math> which maps each object <math>\alpha</math> to <math>(\alpha, \alpha, \eta_\alpha)</math> and maps each morphism <math>g</math> to <math>(g, g)</math>. This is a [[bijection|bijective]] correspondence between natural transformations <math>S\to T</math> and functors <math>\mathcal A \to (S\downarrow T)</math> which are [[section (category theory)|sections]] of both forgetful functors from <math>S\downarrow T</math>.

==References==
<references />
*{{nlab|id=comma+category|title=Comma category}}

{{DEFAULTSORT:Comma Category}}
[[Category:Category theory]]
[[Category:Category-theoretic categories]]

[[de:Kommakategorie]]

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

Kon-Tiki expedition

2012-12-02T03:03:52Z

Magmalex: /* Later recreations of Kon-Tiki */

{{about|the raft used by Thor Heyerdahl to sail across the Pacific||Kontiki (disambiguation)}}
[[File:Kon-Tiki.jpg|thumb|250px|Kon-Tiki, 1947]]
'''''Kon-Tiki''''' was the [[raft]] used by Norwegian explorer and writer [[Thor Heyerdahl]] in his 1947 expedition across the Pacific Ocean from South America to the [[Polynesia|Polynesian islands]]. It was named after the [[Inca]] sun god, [[Viracocha]], for whom "Kon-Tiki" was said to be an old name. ''[[The Kon-Tiki Expedition: By Raft Across the South Seas|Kon-Tiki]]'' is also the name of Heyerdahl's book and the [[Academy Award for Best Documentary Feature|Academy Award-winning]] [[Kon-Tiki (1950 film)|documentary film]] chronicling his adventures.

Heyerdahl believed that people from South America could have settled Polynesia in [[pre-Columbian]] times, although most anthropologists now believe they did not.<ref>[[Wade Davis]], ''The Wayfinders: Why Ancient Wisdom Matters in the Modern World'', Crawley: University of Western Australia Publishing, p.46.</ref><ref>{{cite web|author=Andrew Lawler |url=http://www.sciencemag.org/content/328/5984/1344.summary |title=Andrew Lawler, ''Beyond Kon-Tiki: Did Polynesians Sail to South America'', Journal ''Science'' Vol. 328 no. 5984 pp. 1344–1347 11 June 2010 |publisher=Sciencemag.org |date= |accessdate=2011-11-09}}</ref><ref>{{cite web|author=Andrew Lawler |url=http://www.sciencemag.org/content/328/5984/1346.short |title=Andrew Lawler, ''Changing Time in the South Pacific'', Journal ''Science'' Vol. 328 no. 5984 p. 1346 11 June 2010 |publisher=Sciencemag.org |date=2010-06-11 |accessdate=2011-11-09}}</ref> His aim in mounting the ''Kon-Tiki'' expedition was to show, by using only the materials and technologies available to those people at the time, that there were no technical reasons to prevent them from having done so. Although the expedition carried some modern equipment, such as a radio, watches, charts, sextant, and metal knives, Heyerdahl argued they were incidental to the purpose of proving that the raft itself could make the journey.

The ''Kon-Tiki'' expedition was funded by private loans, along with donations of equipment from the [[United States Army]]. Heyerdahl and a small team went to [[Peru]], where, with the help of dockyard facilities provided by the Peruvian authorities, they constructed the raft out of [[Ochroma pyramidale|balsa]] logs and other native materials in an indigenous style as recorded in illustrations by Spanish [[conquistadores]]. The trip began on April 28, 1947. Heyerdahl and five companions sailed the raft for 101 days over 6900 km (4,300 miles) across the Pacific Ocean before smashing into a [[reef]] at [[Raroia]] in the [[Tuamotu Islands]] on August 7, 1947. The crew made successful landfall and all returned safely.

Thor Heyerdahl's book about his experience became a bestseller. It was published in 1948 as ''The Kon-Tiki Expedition: By Raft Across the South Seas'', later reprinted as ''Kon-Tiki: Across the Pacific in a Raft''. A documentary [[motion picture]] about the expedition, also called ''Kon-Tiki'' was produced from a write-up and expansion of the crew's filmstrip notes and won an [[Academy Award for Best Documentary Feature|Academy Award]] in 1951. It was directed by [[Thor Heyerdahl]] and edited by [[Olle Nordemar]]. The voyage was also chronicled in the documentary TV-series ''The Kon-Tiki Man: The Life and Adventures of Thor Heyerdahl'', directed by Bengt Jonson.<ref>[http://www.kon-tiki.no/Events/indexold.html ''The Kon-Tiki Man'' episode breakdown]{{dead link|date=November 2011}}</ref>

The original ''Kon-Tiki'' raft is now on display in the [[Kon-Tiki Museum]] in [[Oslo]].

== Crew ==

''Kon-Tiki'' had six men on its crew, and a pet [[parrot]] named Lorita. Crew members included [[Thor Heyerdahl]], Erik Hesselberg, [[Bengt Danielsson]], [[Knut Haugland]], [[Torstein Raaby]], and Herman Watzinger.<ref>{{cite book |url= http://books.google.co.uk/books?id=2ysYAQAAMAAJ&q=Watzinger+Raaby+Haugland&dq=Watzinger+Raaby+Haugland&hl=en&sa=X&ei=gPF9T7OUD6WX0QWKn_maDg&ved=0CDgQ6AEwAQ |title=The Kon-Tiki Expedition |first=Thor |last=Thor Heyerdahl |publisher=Rand McNally|year= 1968 |accessdate=5 April 2012}}</ref> All were Norwegian except for Bengt Danielsson, a Swede. Thor Heyerdahl (1914–2002) was the expedition leader. He was also the author of the book and the narrator of the story. Heyerdahl had studied the ancient people of South America and Polynesia and believed that there was a link between the two. Erik Hesselberg (1914–1972) was the navigator and artist. He painted the large Kon-Tiki figure on the raft's sail. His delightful children's book "Kon-Tiki and I" appeared in Norwegian in 1949 and has since been published in more than 15 languages. Bengt Danielsson (1921–1997) took on the role of steward, in charge of supplies and daily rations. Danielsson was a Swedish sociologist interested in [[human migration|human migration theory]]. He also served as translator, as he was the only member of the crew who spoke Spanish. He was also a voracious reader; his box aboard the raft contained many books. Knut Haugland (1917–2009) was a radio expert, decorated by the British in World War II for actions in the [[Norwegian heavy water sabotage]] that stalled what were believed to be Germany's plans to develop an [[atomic bomb]]. Torstein Raaby (1918–1964) was also in charge of radio transmissions. He gained radio experience while hiding behind German lines during WWII, spying on the German battleship ''[[German battleship Tirpitz|Tirpitz]]''. His secret radio transmissions eventually helped guide in Allied bombers to sink the ship. Herman Watzinger (1910–1986) was an engineer whose area of expertise was in technical measurements. He was the first to join Heyerdahl for the trip. He collected and recorded all sorts of data on the voyage. Much of what he recorded, such as weather data, was sent back to various people, since this area of the ocean was largely understudied.

==Construction==
{{unreferenced section|date=August 2012}}
{{Undue precision|section}}
[[File:Kon-Tiki inside.jpg|thumb|left|The raft in the [[Kon-Tiki Museum]], Oslo]]
The main body of the float was composed of nine [[balsa (tree)|balsa]] tree trunks up to 13.7 metres (45 ft) long, 60 cm (2 ft) in diameter, lashed together with 3.175 cm (1¼ inch) [[hemp]] ropes. Cross-pieces of balsa logs 5.5 m (18 ft) long and 30 cm (1 ft) in diameter were lashed across the logs at 1 m (3 ft) intervals to give lateral support. [[Pine]] splashboards clad the bow, and lengths of pine 2.5 cm (1 inch) thick and 60 cm (2 ft) wide were wedged between the balsa logs and used as [[centerboard]]s.

The main mast was made of lengths of mangrove wood lashed together to form an A-frame 8.8 m (29 ft) high. Behind the main-mast was a cabin of plaited bamboo 4.2 m (14 ft) long and 2.4 m (8 ft) wide was built about 1.21–1.51 m (4–5 feet) high, and roofed with banana leaf thatch. At the stern was a 5.8 m (19 ft) long steering oar of mangrove wood, with a blade of fir.
The main sail was 4.6 m by 5.5 m (15 by 18 feet) on a yard of bamboo stems lashed together. Photographs also show a top-sail above the main sail, and also a mizzen-sail, mounted at the stern.

The raft was partially decked in split bamboo. The main spars were a laminate of wood and reeds and Heyerdahl tested more than twenty different composites before settling on one that proved an effective compromise between bulk and torsional rigidity. No metal was used in the construction.

==Supplies==
''Kon-Tiki'' carried 275 gallons of drinking water in 56 water cans, as well as a number of sealed bamboo rods. The purpose stated by Heyerdahl for carrying modern and ancient containers was to test the effectiveness of ancient water storage. For food ''Kon-Tiki'' carried 200 [[coconut]]s, [[sweet potato]]es, [[calabash|bottle gourds]] and other assorted fruit and roots. The [[U.S. Army Quartermaster Corps]] provided [[field ration]]s, tinned food and survival equipment. In return, the ''Kon-Tiki'' explorers reported on the quality and utility of the provisions. They also caught plentiful numbers of fish, particularly [[flying fish]], "[[mahi-mahi|dolphin fish]]", [[yellowfin tuna]], [[bonito]] and [[shark]].

==Communications==
The expedition carried an [[amateur radio]] station with the call sign of LI2B operated by former [[World War II]] [[Norwegian Resistance|Norwegian underground]] radio operators Knut Haugland and Torstein Raaby.<ref>{{cite journal
|journal=QST
|title=Kon-Tiki Communications - Well Done!
|year=1947
|month=December
|author=Anonymous
|pages=69, 143-148
|publisher=The [[American Radio Relay League]]
}}</ref> Haugland and Raaby maintained regular communication with a number of American, Canadian, and South American stations that relayed ''Kon Tiki's'' status to the Norwegian Embassy in Washington, D.C. On August 5, Haugland made contact with a station in Oslo, Norway, 10,000 miles away.<ref name="arrl">[http://www.arrl.org/news/features/2003/03/05/1/?nc=1 ''An LA, as in Norway, Story, by Bob Merriam, W1NTE''], March 5, 2003</ref><ref>[http://www.arrl.org/news/stories/2002/04/24/1/ ''Thor Heyerdahl of Kon-Tiki fame dies at 87''], April 24, 2002</ref> ''Kon Tiki's'' transmitters were powered by batteries and a hand-cranked generator and operated on the [[40-meter band|40-]], the [[20-meter band|20-meter]] band, the [[10-meter band]], and the [[6-meter band]]. Each unit was water resistant and included 2E30 [[vacuum tubes]] providing 10 [[watt]]s of [[radio frequency|RF]] input. A German Mark V [[transceiver]] was used as a backup unit.<ref name="arrl" />

The radio receiver used throughout the voyage was a [[National Radio Company]] NC-173, once requiring a thorough drying out after being soaked during a shipwreck.<ref>{{cite web|url=http://oak.cats.ohiou.edu/~postr/bapix/NC173.htm |title=Boatanchor Pix, National NC-173 |publisher=Oak.cats.ohiou.edu |date= |accessdate=2011-11-09}}</ref> An "all well, all well" message was sent via LI2B to notify would-be rescuers of the crew's safety.<ref>[http://www.arrl.org/news/stories/2002/04/24/1/ ''Thor Heyerdahl of Kon-Tiki fame dies at 87''] April 24, 2002,</ref>

The call sign LI2B was used by Heyerdahl again in 1969–70, when he built a papyrus reed raft and sailed from Morocco to Barbados in an attempt to show a possible link between the civilization of ancient Egypt and the New World.<ref>{{cite book|author=Thor Heyerdahl|title=The Ra Expeditions|edition=English Edition|year=1971
|location=New York|publisher=Doubleday and Company|page=270}}</ref>

==The voyage==
''Kon-Tiki'' left [[Callao]], [[Peru]], on the afternoon of April 28, 1947. To avoid coastal traffic it was initially towed 50 miles out by the Fleet Tug ''Guardian Rios'' of the [[Peruvian Navy]], then sailed roughly west carried along on the [[Humboldt Current]].<ref>{{cite book |url= http://books.google.co.uk/books?id=LxMNAAAAYAAJ&q=Guardian+Rios&dq=Guardian+Rios&hl=en&sa=X&ei=ogJ-T--xFejK0QWt342hDg&redir_esc=y |title=Kon-Tiki: across the Pacific by raft|page=98 |first=Thor |last= Heyerdahl |publisher=Rand McNally |year= 1984|accessdate=5 April 2012}}</ref>

The crew's first sight of land was the atoll of [[Puka-Puka]] on July 30. On August 4, the 97th day after departure, ''Kon-Tiki'' reached the Angatau atoll. The crew made brief contact with the inhabitants of [[Fangatau|Angatau Island]], but were unable to land safely. Calculations made by Heyerdahl before the trip had indicated that 97 days was the minimum amount of time required to reach the Tuamotu islands, so that the encounter with Angatau showed that they had made good time.

On August 7, the voyage came to an end when the raft struck a reef and was eventually beached on an uninhabited islet off [[Raroia]] Island in the [[Tuamotus|Tuamotu]] group. The team had travelled a distance of around 3,770 nautical miles (c. {{convert|6980|km|abbr=on}}) in 101 days, at an average speed of 1.5 knots.

After spending a number of days alone on the tiny islet, the crew were greeted by men from a village on a nearby island who arrived in canoes, having seen washed-up flotsam from the raft. The crew were taken back to the native village, where they were feted with traditional dances and other festivities. Finally the crew were taken off Raroia to [[Tahiti]] by the French schooner ''Tamara'', with the salvaged ''Kon-Tiki'' in tow.

== Anthropology ==
[[Image:Kneeled moai Easter Island.jpg|thumb|A [[moai]] bearing resemblances to statues around [[Lake Titicaca]] in South America]]
Heyderhal believed that the original inhabitants of Easter Island were the migrants from Peru. He argued that the monumental statues known as [[moai]] resembled sculptures more typical of pre-Columbian Peru than any Polynesian designs. He believed that the Easter Island myth of a power struggle between two peoples called the [[Hanau epe]] and [[Hanau momoko]] was a memory of conflicts between the original inhabitants of the island and a later wave of Native Americans from the Northwest coast, eventually leading to the annihilation of the Hanau epe and the destruction of the island's culture and once-prosperous economy.<ref>Heyderdahl, Thor. ''Easter Island - The Mystery Solved''. Random House New York 1989.</ref><ref name = "Rose">Robert C. Suggs, "Kon-Tiki", in Rosemary G. Gillespie, D. A. Clague (eds), ''Encyclopedia of Islands'', University of California Press, 2009, pp.515-16.</ref>

Most historians consider that the Polynesians from the west were the original inhabitants and that the story of the Hanau epe is either pure myth, or a memory of internal tribal or class conflicts.<ref>William R. Long, "Does 'Rapa Nui' Take Artistic License Too Far?",Los Angeles Times, Friday August 26, 1994, p.21.</ref><ref>John Flenley, Paul G. Bahn, The Enigmas of Easter Island: Island on the Edge, Oxford University Press, 2003, pp.76; 154.</ref><ref>Steven R. Fischer, Island at the End of the World: The Turbulent History of Easter Island, Reaktion Books, 2005, p.42.</ref> However, in 2011 Professor Erik Thorsby of the [[University of Oslo]] presented DNA evidence to the [[Royal Society]] which whilst agreeing with the west origin also identified a distinctive but smaller genetic contribution from South America.<ref>{{cite news|url=http://www.telegraph.co.uk/science/science-news/8582150/Kon-Tiki-explorer-was-partly-right-Polynesians-had-South-American-roots.html|title=Kon-Tiki explorer was partly right – Polynesians had South American roots|publisher=Daily Telegraph|author=Richard Alleyne|date=17 Jun 2011|accessdate=17 Jun 2011}}</ref>

== Later recreations of Kon-Tiki ==
In 1954 [[William Willis (sailor)|William Willis]] sailed alone on a raft from [[Peru]] to [[American Samoa]], successfully completing the journey.<ref>Willis, William (1955). The Epic Voyage of the Seven Little Sisters: A 6700 Mile Voyage Alone Across the Pacific. London: Hutchinson</ref> He sailed 6,700 miles, which was 2,200 miles farther than Kon-Tiki. In the following year the Czech explorer and adventurer [[Eduard Ingris]] attempted to recreate the Kon-Tiki expedition on a balsa raft called ''[[Kantuta Expeditions|Kantuta]]''. His first expedition, ''Kantuta I'', took place in 1955-1956 and led to failure. In 1959 Ingris built a new balsa raft, ''Kantuta II'', and tried to repeat the previous expedition. The second expedition was a success. Ingris was able to cross the Pacific Ocean on the balsa raft from Peru to Polynesia.

[[File:Tangaroa 1.jpg|thumb|175px|''Tangaroa'' anchored by [[Stavern]], Norway]]
On April 28, 2006, a Norwegian team attempted to duplicate the ''Kon-Tiki'' voyage using a newly built raft, the ''Tangaroa'', named after the Māori sea-god [[Tangaroa]]. Again based on records of ancient vessels, this raft used a relatively sophisticated [[square sail]] that allowed sailing into the wind, or [[Tacking (sailing)|tacking]]. It was {{convert|16|m|abbr=on}} high by {{convert|8|m|abbr=on}} wide. The raft also included a set of modern [[navigation]] and [[communication]] equipment, including [[Photovoltaic module|solar panel]]s, [[portable computers]], and [[desalination]] equipment.<ref>Equipment on the Tangaroa included GPS (Global Positioning System), F-77 satellite antenna, AUS (Automatic Identification System), six solar panels to generate electricity, wind generators, desalination equipment, telephone, internet, 3 MAC iBook computers, DVD player and an iPod. [http://azer.com/aiweb/categories/magazine/ai144_folder/144_graphics/kon_tiki_tangaroa_chart.jpg Azerbaijan International, Vol. 14:4] (Winter 2006), p. 35</ref> The crew posted to their website.<ref>{{cite web|url=http://www.tangaroa.no |title=www.tangaroa.no |publisher=www.tangaroa.no |date= |accessdate=2011-11-09}}</ref>

The crew of six was led by [[Torgeir Higraff]], and included [[Olav Heyerdahl]], grandson of [[Thor Heyerdahl]]. The voyage was completed successfully in July 2006. A DVD Documentary: [http://videomaker.no/norwegian/ "The Tangaroa Expedition" (Ekspedisionen Tangaroa)] by Videomaker (Norwegian), 2007. By Photographer Anders Berg and Jenssen. 58 minutes (English, Norwegian, Swedish, Spanish).

On January 30, 2011 An-Tiki, a raft modeled after ''Kon-Tiki'' began a 3,000 mile, 70-day journey across the Atlantic Ocean from the Canary Islands to the island of [[Eleuthera]] in the Bahamas.<ref>{{cite web|url=http://www.eleutheranews.com/local/1159.html |title=The Eleutheran – Eleuthera News, Sport and much more from Eleuthera – The tale of An-Tiki – One raft, four ‘mature’ adventurers and a very big ocean! |publisher=Eleutheranews.com |date= |accessdate=2011-11-09}}</ref> The expedition was piloted by four "‘mature’ and intrepid gentlemen, aged from 56 to 84 years", led by [[Anthony Smith (explorer)|Anthony Smith]].<ref>{{cite web|url=http://www.eleutheranews.com/local/1232.html |title=The Eleutheran – Eleuthera News, Sport and much more from Eleuthera – The An-Tiki Dream Turns into Reality |publisher=Eleutheranews.com |date= |accessdate=2011-11-09}}</ref> The trip was designed to commemorate the journey in an open boat of survivors from the British steamship ''Anglo-Saxon'', sunk by the German cruiser ''[[German auxiliary cruiser Widder|Widder]]'' in 1940. The raft ended its voyage in the Caribbean island of St Maarten, completing its trip to Eleuthera in the following year with Smith and a new crew.<ref>[http://www.telegraph.co.uk/travel/activityandadventure/9247834/Voyage-to-the-brink-of-death.html# Anthony Smith, "Voyage to the Brink of Death", ''The Daily Telegraph'', 06 May 2012],</ref>

== See also ==
* [[Experimental archaeology]]
* [[Polynesian navigation]]
* [[Tupac Inca Yupanqui]]

==References==
{{reflist|2}}

;Bibliography
* Heyerdahl, Thor (1950). [http://www.archive.org/details/kontikiacrossthe012568mbp ''Kon-Tiki'']. Rand McNally & Company. At [[Internet Archive]].
* Hesselberg, Erik (1950). ''Kon-Tiki and I : illustrations with text, begun on the Pacific on board the raft "Kon-Tiki" and completed at "Solbakken" in Borre.'' Allen & Unwin
* Andersson, Axel (2010) ''A Hero for the Atomic Age: Thor Heyerdahl and the Kon-Tiki Expedition'' (Peter Lang) ISBN 978-1-906165-31-4

== External links ==
*[http://www.kon-tiki.no/ Kon-Tiki Museum]
*[http://oak.cats.ohiou.edu/~postr/bapix/NC173.htm National NC-173 receiver]
*[http://www.azer.com/aiweb/categories/magazine/ai144_folder/144_graphics/kon_tiki_tangaroa_chart.jpg Quick Facts: Comparing the Two Rafts: Kon-Tiki and Tangaroa] Azerbaijan International, Vol 14:4 (Winter 2006)
*[http://www.azer.com/aiweb/categories/magazine/ai144_folder/144_articles/144_tangaroa.html Testing Heyerdahl's Theories about Kon-Tiki 60 Years Later: Tangaroa Pacific Voyage (Summer 2006)] Azerbaijan International, Vol 14:4 (Winter 2006)
*[http://www.personal.psu.edu/pjc12/Kon-Tiki%20in%20Reverse--The%20Tahiti-Nui%20Expedition.htm Kon-Tiki in Reverse: The Tahiti-Nui Expedition]
*[http://webtv.tv2.no/webtv/sumo/?treeId=444033&progId=185694 TV2Sumo WebTV programme "Ekspedisjonen Tangaroa" (Tangaroa Expedition) – Norsk]
* [http://www.librarything.com/work/1469859 Acali 1973 – expedition by raft across Atlantic] Librarything, 2007
* [http://www.personal.psu.edu/pjc12/From%20China%20to%20America--The%20Hsu-Fu%20Expedition.htm Hsu-Fu 1993 – bamboo raft across Pacific (west to east)] personal.psu.edu

{{Thor Heyerdahl}}

[[Category:Pre-Columbian trans-oceanic contact]]
[[Category:Individual sailing vessels]]
[[Category:Ships preserved in museums]]
[[Category:Replica ships]]
[[Category:Replications of ancient voyages]]
[[Category:Pacific expeditions]]
[[Category:Sailing expeditions]]
[[Category:Thor Heyerdahl]]

[[ar:رحلة الكونتيكي الاستكشافية]]
[[be:Кон-Цікі]]
[[bg:Кон-Тики]]
[[bs:Kon-Tiki]]
[[ca:Kon-Tiki]]
[[cs:Kon-Tiki]]
[[da:Kon-Tiki]]
[[de:Kon-Tiki]]
[[es:Kon-tiki (expedición)]]
[[eo:Kon-Tiki]]
[[eu:Kon-Tiki]]
[[fr:Kon-Tiki]]
[[hi:कॉन-टिकी]]
[[hr:Kon-Tiki]]
[[it:Kon-Tiki]]
[[he:קון טיקי]]
[[ka:კონ-ტიკი]]
[[lv:Kon-Tiki]]
[[lt:Kon Tikis]]
[[hu:Kon-Tiki (hajó)]]
[[ml:കോൺ-ടിക്കി]]
[[nl:Kon-Tiki (boek)]]
[[ja:コンティキ号]]
[[no:«Kon-Tiki»]]
[[nn:Kon-Tiki-ekspedisjonen]]
[[pl:Wyprawa Kon-Tiki]]
[[pt:Expedição Kon-Tiki]]
[[ru:Кон-Тики]]
[[si:කොන්-ටිකි]]
[[fi:Kon-Tiki]]
[[sv:Kon-Tiki]]
[[tl:Kon-Tiki]]
[[tr:Kon-Tiki]]
[[zh:康提基号]]

Cyclic group

2012-11-30T10:25:59Z

Magmalex: /* Representation */ Added "regular"

{{Groups}}
In algebra, a '''cyclic group''' is a [[group (mathematics)|group]] that is [[generating set of a group|generated]] by a single element, in the sense that every element of the group can be written as a power of some particular element ''g'' in multiplicative notation, or as a multiple of ''g'' in additive notation. This element ''g'' is called a "[[Generating set of a group|generator]]" of the group. Any infinite cyclic group is [[isomorphic]] to '''Z''', the integers with addition as the group operation. Any finite cyclic group of order ''n'' is isomorphic to '''Z'''/''n'''''Z''', the integers modulo n with addition as the group operation.

==Definition==
[[File:Cyclic group.svg|right|thumb|150px|The six 6th complex roots of unity form a cyclic group under multiplication. ''z'' is a primitive element, but ''z''2 is not, because the odd powers of ''z'' are not a power of ''z''2.]]
A group ''G'' is called cyclic if there exists an element ''g'' in ''G'' such that ''G'' = <''g''> = { ''g''''n'' | ''n'' is an integer }. Since any group generated by an element in a group is a subgroup of that group, showing that the only [[subgroup]] of a group G that contains ''g'' is ''G'' itself suffices to show that G is cyclic.

For example, if ''G'' = { ''g''0, ''g''1, ''g''2, ''g''3, ''g''4, ''g''5 } is a group, then ''g''6 = ''g''0, and ''G'' is cyclic. In fact, ''G'' is essentially the same as (that is, [[Isomorphism|isomorphic]] to) the set { 0, 1, 2, 3, 4, 5 } with addition [[modular arithmetic|modulo]] 6. For example, 1 + 2 ≡ 3 (mod 6) corresponds to ''g''1·''g''2 = ''g''3, and 2 + 5 ≡ 1 (mod 6) corresponds to ''g''2·''g''5 = ''g''7 = ''g''1, and so on. One can use the isomorphism χ defined by χ(''g''''i'') = ''i''.

For every positive integer ''n'' there is exactly one cyclic group (up to isomorphism) whose [[Order (group theory)|order]] is ''n'', and there is exactly one infinite cyclic group (the integers under addition). Hence, the cyclic groups are the simplest groups and they are completely classified.

The name "cyclic" may be misleading: it is possible to generate infinitely many elements and not form any literal cycles; that is, every ''g''''n'' is distinct. (It can be said that it has one infinitely long cycle.) A group generated in this way is called an '''infinite cyclic group''', and is isomorphic to the additive group of [[integer]]s '''Z'''.

Furthermore, the [[circle group]] (whose elements are [[uncountable]]) is ''not'' a cyclic group—a cyclic group always has [[countable]] elements.

Since the cyclic groups are [[Abelian group|abelian]], they are often written additively and denoted '''Z'''''n''. However, this notation can be problematic for [[number theory|number theorists]] because it conflicts with the usual notation for [[p-adic number|''p''-adic number]] rings or [[localization of a ring|localization]] at a [[prime ideal]]. The [[quotient group|quotient]] notations '''Z'''/''n'''''Z''', '''Z'''/''n'', and '''Z'''/(''n'') are standard alternatives. We adopt the first of these here to avoid the collision of notation. See also the section [[#Subgroups and notation|Subgroups and notation]] below.

One may write the group multiplicatively, and denote it by ''C''''n'', where ''n'' is the order (which can be ∞). For example, ''g''2''g''4 = ''g''1 in ''C''5, whereas 2 + 4 = 1 in '''Z'''/5'''Z'''.

==Properties==

The [[fundamental theorem of cyclic groups]] states that if ''G'' is a cyclic group of order ''n'' then every [[subgroup]] of ''G'' is cyclic. Moreover, the order of any subgroup of ''G'' is a divisor of ''n'' and for each positive divisor ''k'' of ''n'' the group ''G'' has exactly one subgroup of order ''k''. This property characterizes finite cyclic groups: a group of order ''n'' is cyclic if and only if for every divisor ''d'' of ''n'' the group has at most one subgroup of order ''d''. Sometimes the refined statement is used: a group of order ''n'' is cyclic if and only if for every divisor ''d'' of ''n'' the group has exactly one subgroup of order ''d''.

Every finite cyclic group is [[Isomorphism|isomorphic]] to the group { [0], [1], [2], ..., [''n'' − 1] } of integers modulo ''n'' under addition, and any infinite cyclic group is isomorphic to '''Z''' (the set of all integers) under addition. Thus, one only needs to look at such groups to understand the properties of cyclic groups in general. Hence, cyclic groups are one of the simplest groups to study and a number of nice properties are known.

Given a cyclic group ''G'' of order ''n'' (''n'' may be infinity) and for every ''g'' in ''G'',
* G is [[abelian group|abelian]]; that is, their group operation is commutative: ''gh'' = ''hg'' (for all ''g'' and ''h'' in ''G''). This is so since ''r'' + ''s'' ≡ ''s'' + ''r'' (mod ''n'').
* If ''n'' is finite, then ''g''''n'' = ''g''0 is the identity element of the group, since ''kn'' ≡ 0 (mod ''n'') for any integer ''k''.
* If ''n'' = ∞, then there are exactly two elements that each generate the group: namely 1 and −1 for '''Z'''.
* If ''n'' is finite, then there are exactly φ(''n'') elements that generate the group on their own, where φ is the [[Euler totient function]].
* Every subgroup of ''G'' is cyclic. Indeed, each finite subgroup of ''G'' is a group of { 0, 1, 2, 3, ..., ''m'' − 1 } with addition modulo ''m''. And each infinite subgroup of ''G'' is ''m'''''Z''' for some ''m'', which is bijective to (so isomorphic to) '''Z'''.
* ''C''''n'' is isomorphic to '''Z'''/''n'''''Z''' ([[factor group]] of '''Z''' over ''n'''''Z''') since '''Z'''/''n'''''Z''' = {0 + ''n'''''Z''', 1 + ''n'''''Z''', 2 + ''n'''''Z''', 3 + ''n'''''Z''', 4 + ''n'''''Z''', ..., ''n'' − 1 + ''n'''''Z'''} ≅ { 0, 1, 2, 3, 4, ..., ''n'' − 1 } under addition modulo ''n''.

More generally, if ''d'' is a [[divisor]] of ''n'', then the number of elements in '''Z'''/''n'' which have order ''d'' is φ(''d''). The order of the residue class of ''m'' is ''n'' / [[greatest common divisor|gcd]](''n'',''m'').

If ''p'' is a [[prime number]], then the only group ([[up to]] [[group isomorphism|isomorphism]]) with ''p'' elements is the cyclic group ''C''''p'' or '''Z'''/''p'''''Z'''. There are more numbers with the same property, see [[cyclic number (group theory)|cyclic number]].

The [[direct product of groups|direct product]] of two cyclic groups '''Z'''/''n'''''Z''' and '''Z'''/''m'''''Z''' is cyclic if and only if ''n'' and ''m'' are [[coprime]]. Thus e.g. '''Z'''/12'''Z''' is the direct product of '''Z'''/3'''Z''' and '''Z'''/4'''Z''', but not the direct product of '''Z'''/6'''Z''' and '''Z'''/2'''Z'''.

The definition immediately implies that cyclic groups have [[presentation of a group|group presentation]] ''C''∞ = < ''x'' | > and ''C''''n'' = < ''x'' | ''x''''n'' > for finite ''n''.

A [[primary cyclic group]] is a group of the form '''Z'''/''p''''k'''''Z''' where ''p'' is a [[prime number]]. The [[fundamental theorem of finitely generated abelian groups|fundamental theorem of abelian groups]] states that every [[finitely generated abelian group]] is the direct product of finitely many finite primary cyclic and infinite cyclic groups.

'''Z'''/''n'''''Z''' and '''Z''' are also [[commutative ring]]s. If ''p'' is a prime, then '''Z'''/''p'''''Z''' is a [[finite field]], also denoted by '''F'''''p'' or '''GF'''(''p''). Every field with ''p'' elements is [[isomorphism|isomorphic]] to this one.

The [[Unit (ring theory)|units]] of the ring '''Z'''/''n'''''Z''' are the numbers [[coprime]] to ''n''. They form a [[Multiplicative group of integers modulo n|group under multiplication modulo ''n'']] with φ(''n'') elements (see above). It is written as ('''Z'''/''n'''''Z''')×.
For example, when ''n'' = 6, we get ('''Z'''/''n'''''Z''')× = {1,5}.
When ''n'' = 8, we get ('''Z'''/''n'''''Z''')× = {1,3,5,7}.

In fact, it is known that ('''Z'''/''n'''''Z''')× is cyclic if and only if ''n'' is 1 or 2 or 4 or ''p''''k'' or 2 ''p''''k'' for an [[odd number|odd]] [[prime number]] ''p'' and ''k'' ≥ 1, in which case every generator of ('''Z'''/''n'''''Z''')× is called a [[primitive root modulo n|primitive root modulo ''n'']]. Thus, ('''Z'''/''n'''''Z''')× is cyclic for ''n'' = 6, but not for ''n'' = 8, where it is instead isomorphic to the [[Klein four-group]].

The group ('''Z'''/''p'''''Z''')× is cyclic with ''p'' − 1 elements for every prime ''p'', and is also written ('''Z'''/''p'''''Z''')* because it consists of the non-zero elements. More generally, every ''finite'' [[subgroup]] of the multiplicative group of any [[field (mathematics)|field]] is cyclic.

Let ''G'' be a finite group. Then ''G'' is a cyclic group if, for each ''n'' > 0, ''G'' contains at most ''n'' elements of order dividing ''n''.{{fact|date=August 2012}} For example, it follows immediately from this that the multiplicative group of a finite field is cyclic.

==Examples==
In 2D and 3D the [[symmetry group]] for ''n''-fold [[rotational symmetry]] is ''C''''n'', of abstract group type Zn. In 3D there are also other symmetry groups which are algebraically the same, see [[Point groups in three dimensions#Cyclic 3D symmetry groups|''Symmetry groups in 3D that are cyclic as abstract group'']].

Note that the group ''S''1 of all rotations of a [[circle]] (the [[circle group]]) is ''not'' cyclic, since it is not even [[countable]].

The ''n''th [[root of unity|roots of unity]] form a cyclic group of order ''n'' under multiplication. e.g., {{nowrap|1=0 = ''z''3 − 1 = (''z'' − ''s''0)(''z'' − ''s''1)(''z'' − ''s''2)}} where {{nowrap|1=''s'' = ''e''2π''i''/3}} and a group of {''s''0, ''s''1, ''s''2} under multiplication is cyclic.

The [[Galois group]] of every finite [[field extension]] of a [[finite field]] is finite and cyclic; conversely, given a finite field ''F'' and a finite cyclic group ''G'', there is a finite field extension of ''F'' whose Galois group is ''G''.

==Representation==

The [[cycle graph (group)|cycle graphs]] of finite cyclic groups are all ''n''-sided regular polygons with the elements at the vertices. The dark vertex in the cycle graphs below stand for the identity element, and the other vertices are the other elements of the group. A cycle consists of successive powers of either of the elements connected to the identity element.

<table border="1" cellpadding="2" cellspacing="0" align=center>
<tr>
<th>[[File:GroupDiagramMiniC1.png|center]]</th>
<th>[[File:GroupDiagramMiniC2.png|center]]</th>
<th>[[File:GroupDiagramMiniC3.png|center]]</th>
<th>[[File:GroupDiagramMiniC4.png|center]]</th>
<th>[[File:GroupDiagramMiniC5.png|center]]</th>
<th>[[File:GroupDiagramMiniC6.png|center]]</th>
<th>[[File:GroupDiagramMiniC7.png|center]]</th>
<th>[[File:GroupDiagramMiniC8.png|center]]</th>
</tr>
<tr align="center">
<td>C1</td>
<td>C2</td>
<td>C3</td>
<td>C4</td>
<td>C5</td>
<td>C6</td>
<td>C7</td>
<td>C8</td>
</tr>
</table>

The [[Representation theory of finite groups|representation theory]] of the cyclic group is a critical base case for the representation theory of more general finite groups. In the [[character theory|complex case]], a representation of a cyclic group decomposes into a direct sum of linear characters, making the connection between character theory and representation theory transparent. In the [[modular representation theory|positive characteristic case]], the indecomposable representations of the cyclic group form a model and inductive basis for the representation theory of groups with cyclic [[Sylow subgroup]]s and more generally the representation theory of blocks of cyclic defect.

==Subgroups and notation==

All [[subgroup]]s and [[quotient group]]s of cyclic groups are cyclic. Specifically, all subgroups of '''Z''' are of the form ''m'''''Z''', with ''m'' an integer ≥0. All of these subgroups are different, and apart from the trivial group (for ''m''=0) all are [[isomorphic]] to '''Z'''. The [[lattice of subgroups]] of '''Z''' is isomorphic to the [[Duality (order theory)|dual]] of the lattice of natural numbers ordered by [[divisibility]]. All factor groups of '''Z''' are finite, except for the trivial exception '''Z'''/{0} = '''Z'''/0'''Z'''. For every positive divisor ''d'' of ''n'', the quotient group '''Z'''/''n'''''Z''' has precisely one subgroup of order ''d'', the one generated by the residue class of ''n''/''d''. There are no other subgroups. The lattice of subgroups is thus isomorphic to the set of divisors of ''n'', ordered by divisibility. In particular, a cyclic group is [[simple group|simple]] if and only if its order (the number of its elements) is prime.<ref>Gannon (2006), {{Google books quote|id=ehrUt21SnsoC|page=18|text=Zn is simple iff n is prime|p. 18}}</ref>

Using the quotient group formalism, '''Z'''/''n'''''Z''' is a standard notation for the additive cyclic group with ''n'' elements. In [[Ring (mathematics)|ring]] terminology, the subgroup ''n'''''Z''' is also the [[ideal (ring theory)|ideal]] (''n''), so the quotient can also be written '''Z'''/(''n'') or '''Z'''/''n'' without abuse of notation. These alternatives do not conflict with the notation for the ''p''-adic integers. The last form is very common in informal calculations; it has the additional advantage that it reads the same way that the group or ring is often described verbally in English, "Zee mod en".

==Endomorphisms==

The [[endomorphism ring]] of the abelian group '''Z'''/''n'''''Z''' is [[ring homomorphism|isomorphic]] to '''Z'''/''n'''''Z''' itself as a [[ring (algebra)|ring]]. Under this isomorphism, the number ''r'' corresponds to the endomorphism of '''Z'''/''n'''''Z''' that maps each element to the sum of ''r'' copies of it. This is a bijection if and only if ''r'' is coprime with ''n'', so the [[automorphism group]] of '''Z'''/''n'''''Z''' is isomorphic to the unit group ('''Z'''/''n'''''Z''')× (see above).

Similarly, the endomorphism ring of the additive group '''Z''' is isomorphic to the ring '''Z'''. Its automorphism group is isomorphic to the group of units of the ring '''Z''', i.e. to {−1, +1} <math> \cong</math> ''C''2.

==Virtually cyclic groups==

A group is called '''virtually cyclic''' if it contains a cyclic subgroup of finite [[index (group theory)|index]] (the number of [[coset]]s that the subgroup has).
In other words, any element in a virtually cyclic group can be arrived at by applying a member of the cyclic subgroup to a member in a certain finite set.
Every cyclic group is virtually cyclic, as is every finite group.
It is known that a finitely generated [[discrete group]] with exactly two ''[[end (topology)|ends]]'' is virtually cyclic (for instance the [[direct product of groups|product]] of '''Z'''/''n'' and '''Z'''). Every abelian subgroup of a [[hyperbolic group|Gromov hyperbolic group]] is virtually cyclic.

==See also==
*[[Cyclic extension]]
*[[Cyclic module]]
*[[Cyclically ordered group]]
*[[Locally cyclic group]], a group in which each finitely generated subgroup is cyclic
*[[Modular arithmetic]]

==Notes==
{{Reflist}}

==References==
*{{Citation | last1=Gallian | first1=Joseph | title=Contemporary abstract algebra | publisher=Houghton Mifflin | location=Boston | edition=4th | isbn=978-0-669-86179-2 | year=1998 }}, especially chapter 4.
*{{Citation | last1=Herstein | first1=I. N. | title=Abstract algebra | publisher=[[Prentice Hall]] | edition=3rd | isbn=978-0-13-374562-7 | mr=1375019 | year=1996 }}, especially pages 53–60.
*{{Citation |last1=Gannon |first1=Terry |authorlink1= |last2= |first2= |authorlink2= |title=Moonshine beyond the monster: the bridge connecting algebra, modular forms and physics |url= |edition= |series=Cambridge monographs on mathematical physics |volume= |year=2006 |publisher=Cambridge University Press |location= |isbn=978-0-521-83531-2 }}
*Milne, Group theory, http://www.jmilne.org/math/CourseNotes/gt.html

==External links==
* {{springer|title=Cyclic group|id=p/c027510}}
* [http://members.tripod.com/~dogschool/cyclic.html An introduction to cyclic groups]

==Further reading==
*{{Citation |last=Lajoie |first=Caroline |last2=Mura |first2=Roberta |date=November 2000 |title=What's in a Name? A Learning Difficulty in Connection with Cyclic Groups |journal=For the Learning of Mathematics |volume=20 |issue=3 |pages=29–33 |jstor=40248334}}

{{DEFAULTSORT:Cyclic Group}}
[[Category:Abelian group theory]]
[[Category:Finite groups]]
[[Category:Properties of groups]]

[[ar:زمرة دائرية]]
[[ca:Grup cíclic]]
[[cs:Cyklická grupa]]
[[de:Zyklische Gruppe]]
[[es:Grupo cíclico]]
[[eo:Cikla grupo]]
[[fr:Groupe cyclique]]
[[ko:순환군]]
[[it:Gruppo ciclico]]
[[he:חבורה ציקלית]]
[[hu:Ciklikus csoport]]
[[nl:Cyclische groep]]
[[ja:巡回群]]
[[ml:ചാക്രികഗ്രൂപ്പ്]]
[[pl:Grupa cykliczna]]
[[pt:Grupo cíclico]]
[[ru:Циклическая группа]]
[[sr:Циклична група]]
[[sh:Ciklična grupa]]
[[fi:Syklinen ryhmä]]
[[sv:Cyklisk grupp]]
[[ta:சுழற் குலம்]]
[[uk:Циклічна група]]
[[vi:Nhóm cyclic]]
[[zh:循環群]]

Filter (mathematics)

2012-11-24T09:31:09Z

Magmalex: added "partially"

[[Image:Upset.svg|thumb|The powerset algebra of the set <math>\{1,2,3,4\}</math> with the upset <math>\uparrow\!\{1\}</math> colored green. The green elements make a ''principal ultrafilter'' on the lattice.]]
In [[mathematics]], a '''filter''' is a special [[subset]] of a [[partially ordered set]]. A frequently used special case is the situation that the partially ordered set under consideration is just the [[power set]] of some set, ordered by set inclusion. Filters appear in [[order theory|order]] and [[lattice theory]], but can also be found in [[topology]] whence they originate. The [[duality (order theory)|dual]] notion of a filter is an [[ideal (order theory)|ideal]].

Filters were introduced by [[Henri Cartan]] in 1937<ref>H. Cartan, "Théorie des filtres". ''CR Acad. Paris'', '''205''', (1937) 595–598.</ref><ref>H. Cartan, "Filtres et ultrafiltres" ''CR Acad. Paris'', '''205''', (1937) 777–779.</ref> and subsequently used by [[Bourbaki]] in their book ''[[Topologie Générale]]'' as an alternative to the similar notion of a [[net (topology)|net]] developed in 1922 by [[E. H. Moore]] and [[H. L. Smith]].

== General definition ==
A [[non-empty]] subset ''F'' of a partially ordered set (''P'',≤) is a '''filter''' if the following conditions hold:

# For every ''x'', ''y'' in ''F'', there is some element ''z'' in ''F'' such that ''z'' ≤ ''x'' and ''z'' ≤ ''y''. (''F'' is a '''filter base''')
# For every ''x'' in ''F'' and ''y'' in ''P'', ''x'' ≤ ''y'' implies that ''y'' is in ''F''. (''F'' is an ''[[upper set]]'')
# A filter is '''proper''' if it is not equal to the whole set ''P''. This is sometimes omitted from the definition of a filter.

While the above definition is the most general way to define a filter for arbitrary [[Partially ordered set|posets]], it was originally defined for [[lattice (order)|lattice]]s only. In this case, the above definition can be characterized by the following equivalent statement:
A non-empty subset ''F'' of a lattice (''P'',≤) is a filter, [[if and only if]] it is an upper set that is closed under finite meets ([[infimum|infima]]), i.e., for all ''x'', ''y'' in ''F'', we find that ''x'' ∧ ''y'' is also in ''F''.

The smallest filter that contains a given element ''p'' is a '''principal filter''' and ''p'' is a '''principal element''' in this situation. The principal filter for ''p'' is just given by the set {''x'' in ''P'' | ''p'' ≤ ''x''} and is denoted by prefixing ''p'' with an upward arrow: <math>\uparrow p</math>.

The dual notion of a filter, i.e. the concept obtained by reversing all ≤ and exchanging ∧ with ∨, is '''ideal'''. Because of this duality, the discussion of filters usually boils down to the discussion of ideals. Hence, most additional information on this topic (including the definition of '''maximal filters''' and '''prime filters''') is to be found in the article on [[ideal (order theory)|ideals]]. There is a separate article on [[ultrafilter]]s.

== Filter on a set ==
A special case of a filter is a filter defined on a set. Given a set ''S'', a partial ordering ⊆ can be defined on the [[powerset]] '''P'''(''S'') by subset inclusion, turning ('''P'''(''S''),⊆) into a lattice. Define a '''filter''' ''F'' on ''S'' as a subset of '''P'''(''S'') with the following properties:

# ''S'' is in ''F''. (''F is non-empty'')
# The empty set is not in ''F''. (''F is proper'')
# If ''A'' and ''B'' are in ''F'', then so is their intersection. (''F is closed under finite meets'')
# If ''A'' is in ''F'' and ''A'' is a subset of ''B'', then ''B'' is in ''F'', for all subsets ''B'' of ''S''. (''F is an upper set'')

The first three properties imply that a '''filter on a set''' has the [[finite intersection property]]. Note that with this definition, a filter on a set is indeed a filter; in fact, it is a proper filter. Because of this, sometimes this is called a '''proper filter on a set'''; however, as long as the set context is clear, the shorter name is sufficient.

A '''filter base''' (or '''filter basis''') is a subset ''B'' of '''P'''(''S'') with the following properties:
# The intersection of any two sets of ''B'' contains a set of ''B''
# ''B'' is non-empty and the empty set is not in ''B''

Given a filter base ''B'', one may obtain a (proper) filter by including all sets of '''P'''(''S'') which contain a set of ''B''. The resulting filter is said to be generated by or spanned by filter base ''B''. Every filter is also a filter
base, so the process of passing from filter base to filter may
be viewed as a sort of completion.

If ''B'' and ''C'' are two filter bases on ''S'', one says ''C'' is '''finer''' than ''B'' (or that ''C'' is a '''refinement''' of ''B'') if for each ''B''0 ∈ ''B'', there is a ''C''0 ∈ ''C'' such that ''C''0 ⊆ ''B''0.
* For filter bases ''B'' and ''C'', if ''B'' is finer than ''C'' and ''C'' is finer than ''B'', then ''B'' and ''C'' are said to be '''equivalent filter bases'''. Two filter bases are equivalent if and only if the filters they generate are equal.
* For filter bases ''A'', ''B'', and ''C'', if ''A'' is finer than ''B'' and ''B'' is finer than ''C'' then ''A'' is finer than ''C''. Thus the refinement relation is a [[preorder]] on the set of filter bases, and the passage from filter base to filter is an instance of passing from a preordering to the associated partial ordering.

Given a subset ''T'' of '''P'''(''S'') we can ask whether there exists a smallest filter ''F'' containing ''T''. Such a filter exists if and only if the finite intersection of subsets of ''T'' is non-empty. We call ''T'' a '''subbase''' of ''F'' and say ''F'' is '''generated''' by ''T''. ''F'' can be constructed by taking all finite intersections of ''T'' which is then filter base for ''F''.

=== Examples ===
* Let ''S'' be a nonempty set and ''C'' be a nonempty subset. Then <math>\{ C \}</math> is a filter base. The filter it generates (i.e., the collection of all subsets containing ''C'') is called the '''principal filter''' generated by ''C''.

* A filter is said to be a '''free filter''' if the intersection of all of its members is empty. A principal filter is not free. Since the intersection of any finite number of members of a filter is also a member, no filter on a finite set is free, and indeed is the principal filter generated by the common intersection of all of its members. A nonprincipal filter on an infinite set is not necessarily free.

* The [[Fréchet filter]] on an infinite set ''S'' is the set of all subsets of ''S'' that have finite complement. The Fréchet filter is free, and it is contained in every free filter on ''S''.

* Every [[uniform structure]] on a set ''X'' is a filter on ''X''×''X''.

* A filter in a [[poset]] can be created using the [[Rasiowa-Sikorski lemma]], often used in [[forcing (mathematics)|forcing]].

* The set <math>\{ \{ N, N+1, N+2, \dots \} : N \in \{1,2,3,\dots\} \}</math> is called a ''filter base of tails'' of the sequence of natural numbers <math>(1,2,3,\dots)</math>. A filter base of tails can be made of any [[net (mathematics)|net]] <math>(x_\alpha)_{\alpha \in A}</math> using the construction <math>\{ \{ x_\alpha : \alpha \in A, \alpha_0 \leq a \} : \alpha_0 \in A \}\,</math>. Therefore, all nets generate a filter base (and therefore a filter). Since all sequences are nets, this holds for sequences as well.

=== Filters in model theory ===
For any filter ''F'' on a set ''S'', the set function defined by
:<math>
m(A)=
\begin{cases}
1 & \text{if }A\in F \\
0 & \text{if }S\setminus A\in F \\
\text{undefined} & \text{otherwise}
\end{cases}
</math>
is finitely additive — a "[[measure (mathematics)|measure]]" if that term is construed rather loosely. Therefore the statement

:<math>\left\{\,x\in S: \varphi(x)\,\right\}\in F</math>

can be considered somewhat analogous to the statement that φ holds "almost everywhere". That interpretation of membership in a filter is used (for motivation, although it is not needed for actual ''proofs'') in the theory of [[ultraproduct]]s in [[model theory]], a branch of [[mathematical logic]].

=== Filters in topology ===
In [[topology]] and analysis, filters are used to define convergence in a manner similar to the role of [[sequence]]s in a [[metric space]].

In topology and related areas of mathematics, a filter is a generalization of a [[net (mathematics)|net]]. Both nets and filters provide very general contexts to unify the various notions of [[Limit (mathematics)|limit]] to arbitrary [[topological space]]s.

A [[sequence]] is usually indexed by the [[natural numbers]], which are a [[totally ordered set]]. Thus, limits in [[first-countable space]]s can be described by sequences. However, if the space is not first-countable, nets or filters must be used. Nets generalize the notion of a sequence by requiring the index set simply be a [[directed set]]. Filters can be thought of as sets built from multiple nets. Therefore, both the limit of a filter and the limit of a net are conceptually the same as the limit of a sequence.

====Neighbourhood bases====

Let ''X'' be a topological space and ''x'' a point of ''X''.

* Take ''N''''x'' to be the '''[[Neighbourhood system|neighbourhood filter]]''' at point ''x'' for ''X''. This means that ''N''''x'' is the set of all topological [[neighbourhood (mathematics)|neighbourhood]]s of the point ''x''. It can be verified that ''N''''x'' is a filter. A '''neighbourhood system''' is another name for a '''neighbourhood filter'''.

* To say that ''N'' is a '''neighbourhood base''' at ''x'' for ''X'' means that each subset ''V''0 of X is a neighbourhood of ''x'' if and only if there exists ''N''0 ∈ ''N'' such that ''N''0 ⊆ ''V''0. Note that every neighbourhood base at ''x'' is a filter base that generates the neighbourhood filter at ''x''.

====Convergent filter bases====

Let ''X'' be a topological space and ''x'' a point of ''X''.

* To say that a filter base ''B'' '''converges''' to ''x'', denoted ''B'' → ''x'', means that for every neighbourhood ''U'' of ''x'', there is a ''B''0 ∈ ''B'' such that ''B''0 ⊆ ''U''. In this case, ''x'' is called a [[Limit (mathematics)|limit]] of ''B'' and ''B'' is called a '''convergent filter base'''.

* For every neighbourhood base ''N'' of ''x'', ''N'' → ''x''.
** If ''N'' is a neighbourhood base at ''x'' and ''C'' is a filter base on ''X'', then ''C'' → ''x'' [[if and only if]] ''C'' is finer than ''N''.
** For ''Y'' ⊆ ''X'', to say that ''p'' is a limit point of ''Y'' in ''X'' means that for each neighborhood ''U'' of ''p'' in ''X'', ''U''∩(''Y'' − {''p''})≠∅.
** For ''Y'' ⊆ ''X'', ''p'' is a limit point of ''Y'' in ''X'' if and only if there exists a filter base ''B'' on ''Y'' − {''p''} such that ''B'' → ''p''.
* For ''Y'' ⊆ ''X'', the following are equivalent:
** (i) There exists a filter base ''F'' whose elements are all contained in ''Y'' such that ''F'' → ''x''.
** (ii) There exists a filter ''F'' such that ''Y'' is an element of ''F'' and ''F'' → ''x''.
** (iii) The point ''x'' lies in the closure of ''Y''.

Indeed:

(i) implies (ii): if ''F'' is a filter base satisfying the properties of (i), then the filter associated to ''F'' satisfies the properties of (ii).

(ii) implies (iii): if ''U'' is any open neighborhood of ''x'' then by the definition of convergence ''U'' is an element of ''F''; since also ''Y'' is an element of ''F'',
''U'' and ''Y'' have nonempty intersection.

(iii) implies (i): Define <math> F = \{ U \cap Y \ | \ U \in N_x \}</math>. Then ''F'' is a filter base satisfying the properties of (i).

====Clustering====

Let ''X'' be a topological space and ''x'' a point of ''X''.

* To say that ''x'' is a [[cluster point]] for a filter base ''B'' on ''X'' means that for each ''B''0 ∈ ''B'' and for each neighbourhood ''U'' of ''x'' in ''X'', ''B''0∩''U''≠∅. In this case, ''B'' is said to '''cluster''' at point ''x''.
** For filter bases ''B'' and ''C'' such that ''C'' is finer than ''B'' and ''C'' clusters at point ''x'', ''B'' clusters at ''x'', too.
** For filter base ''B'' such that ''B'' → ''x'', the limit ''x'' is a cluster point.
** For filter base ''B'' with cluster point ''x'', it is '''not''' the case that ''x'' is necessarily a limit.
** For a filter base ''B'' that clusters at point ''x'', there is a filter base ''C'' that is finer than filter base ''B'' that converges to ''x''.
** For a filter base ''B'', the set ∩{cl(''B''0) : ''B''0∈''B''} is the set of all cluster points of ''B'' (note: cl(''B''0) is the [[closure (topology)|closure]] of ''B''0). Assume that ''T'' is a [[partially ordered set]].
*** The [[limit inferior]] of ''B'' is the [[infimum]] of the set of all cluster points of ''B''.
*** The [[limit superior]] of ''B'' is the [[supremum]] of the set of all cluster points of ''B''.
*** ''B'' is a convergent filter base [[if and only if]] its limit inferior and limit superior agree; in this case, the value on which they agree is the limit of the filter base.

====Properties of a topological space====

Let ''X'' be a topological space.

* ''X'' is a [[Hausdorff space]] [[if and only if]] every filter base on ''X'' has at most one limit.
* ''X'' is [[Compact space|compact]] if and only if every filter base on ''X'' clusters.
* ''X'' is compact if and only if every filter base on ''X'' is a subset of a convergent filter base.
* ''X'' is compact if and only if every [[ultrafilter]] on ''X'' converges.

====Functions on topological spaces====

Let ''X'', ''Y'' be topological spaces. Let ''B'' be a filter base on ''X'' and <math>f\colon X \to Y</math> be a function. The [[Image (mathematics)|image]] of ''B'' under ''f'' is ''f''[''B''] is the set <math>\{ f(x) : x \in B \}</math>. The image ''f''[''B''] forms a filter base on ''Y''.
* ''f'' is [[Continuous function (topology)|continuous]] at ''x'' if and only if <math>B \to x</math> implies <math>f[B] \to f(x)</math>.

==== Cauchy filters ====

Let <math>(X,d)</math> be a [[metric space]].
* To say that a filter base ''B'' on ''X'' is '''Cauchy''' means that for each [[real number]] ε>0, there is a ''B''0 ∈ ''B'' such that the metric [[diameter]] of ''B''0 is less than ε.
* Take (''xn'') to be a [[sequence]] in metric space ''X''. (''xn'') is a [[Cauchy sequence]] if and only if the filter base {{''xN,xN''+1,...} : ''N'' ∈ {1,2,3,...} } is Cauchy.

More generally, given a [[uniform space]] ''X'', a filter ''F'' on ''X'' is called '''Cauchy filter''' if for every [[entourage (topology)|entourage]] ''U'' there is an ''A'' ∈ ''F'' with (''x,y'') ∈ ''U'' for all ''x,y'' ∈ ''A''. In a metric space this agrees with the previous definition. ''X'' is said to be complete if every Cauchy filter converges. Conversely, on a uniform space every convergent filter is a Cauchy filter. Moreover, every cluster point of a Cauchy filter is a limit point.

A compact uniform space is complete: on a compact space each filter has a cluster point, and if the filter is Cauchy, such a cluster point is a limit point. Further, a uniformity is compact if and only if it is complete and [[totally bounded]].

Most generally, a [[Cauchy space]] is a set equipped with a class of filters declared to be Cauchy. These are required to have the following properties:
# for each ''x'' in ''X'', the [[ultrafilter]] at ''x'', ''U''(''x''), is Cauchy.
# if ''F'' is a Cauchy filter, and ''F'' is a subset of a filter ''G'', then ''G'' is Cauchy.
# if ''F'' and ''G'' are Cauchy filters and each member of ''F'' intersects each member of ''G'', then ''F'' ∩ ''G'' is Cauchy.
The Cauchy filters on a uniform space have these properties, so every uniform space (hence every metric space) defines a Cauchy space.

== See also ==
* [[Ultrafilter]]
* [[Filtration (mathematics)]]
* [[Net (mathematics)]]

== Notes ==
{{reflist}}

== References ==
*[[Nicolas Bourbaki]], <cite>General Topology</cite> (<cite>Topologie Générale</cite>), ISBN 0-387-19374-X (Ch. 1-4): Provides a good reference for filters in general topology (Chapter I) and for Cauchy filters in uniform spaces (Chapter II)
* Stephen Willard, ''General Topology'', (1970) Addison-Wesley Publishing Company, Reading Massachusetts. ''(Provides an introductory review of filters in topology.)''
*David MacIver, ''[http://www.efnet-math.org/~david/mathematics/filters.pdf Filters in Analysis and Topology]'' (2004) ''(Provides an introductory review of filters in topology and in metric spaces.)''
* Burris, Stanley N., and H.P. Sankappanavar, H. P., 1981. ''[http://www.thoralf.uwaterloo.ca/htdocs/ualg.html A Course in Universal Algebra.]'' Springer-Verlag. ISBN 3-540-90578-2.
* [[Victor Porton]]. ''[http://ijpam.eu/contents/2012-74-1/6/6.pdf Filters on Posets and Generalizations]'' (2012). [http://ijpam.eu IJPAM]

[[Category:Order theory]]
[[Category:General topology]]

[[cs:Filtr (matematika)]]
[[de:Filter (Mathematik)]]
[[et:Filter (matemaatika)]]
[[es:Filtro (matemáticas)]]
[[fr:Filtre (mathématiques)]]
[[ko:필터 (수학)]]
[[it:Filtro (matematica)]]
[[he:מסנן (תורת הקבוצות)]]
[[nl:Filter (wiskunde)]]
[[ja:フィルター (数学)]]
[[pms:Fìlter]]
[[pl:Filtr (matematyka)]]
[[pt:Filtro (teoria dos conjuntos)]]
[[ru:Фильтр (математика)]]
[[uk:Фільтр (порядок)]]
[[zh:滤子 (数学)]]

Limit (category theory)

2012-11-24T09:01:02Z

Magmalex: /* Limits */

In [[category theory]], a branch of [[mathematics]], the abstract notion of a '''limit''' captures the essential properties of universal constructions such as [[product (category theory)|products]], [[pullback (category theory)|pullbacks]] and [[inverse limit]]s. The [[duality (category theory)|dual notion]] of a '''colimit''' generalizes constructions such as [[disjoint union]]s, [[direct sum]]s, [[coproduct]]s, [[pushout (category theory)|pushout]]s and [[direct limit]]s.

Limits and colimits, like the strongly related notions of [[universal property|universal properties]] and [[adjoint functors]], exist at a high level of abstraction. In order to understand them, it is helpful to first study the specific examples these concepts are meant to generalize.

== Definition ==
Limits and colimits in a [[category (mathematics)|category]] ''C'' are defined by means of diagrams in ''C''. Formally, a '''[[diagram (category theory)|diagram]]''' of type ''J'' in ''C'' is a [[functor]] from ''J'' to ''C'':
:''F'' : ''J'' → ''C''.
The category ''J'' is thought of as [[index category]], and the diagram ''F'' is thought of as indexing a collection of objects and morphisms in ''C'' patterned on ''J''. The actual objects and morphisms in ''J'' are largely irrelevant—only the way in which they are interrelated matters.

One is most often interested in the case where the category ''J'' is a [[small category|small]] or even [[Finite set|finite]] category. A diagram is said to be '''small''' or '''finite''' whenever ''J'' is.

===Limits===
Let ''F'' : ''J'' → ''C'' be a diagram of type ''J'' in a category ''C''. A '''[[cone (category theory)|cone]]''' to ''F'' is an object ''N'' of ''C'' together with a family ψ''X'' : ''N'' → ''F''(''X'') of morphisms indexed by the objects of ''J'', such that for every morphism ''f'' : ''X'' → ''Y'' in ''J'', we have ''F''(''f'') o ψ''X'' = ψ''Y''.

A '''limit''' of the diagram ''F'' : ''J'' → ''C'' is a cone (''L'', φ) to ''F'' such that for any other cone (''N'', ψ) to ''F'' there exists a ''unique'' morphism ''u'' : ''N'' → ''L'' such that φ''X'' o ''u'' = ψ''X'' for all ''X'' in ''J''.
[[File:Functor cone (extended).svg|center|A universal cone]]
One says that the cone (''N'', ψ) factors through the cone (''L'', φ) with
the unique factorization ''u''. The morphism ''u'' is sometimes called the '''mediating morphism'''.

Limits are also referred to as ''[[universal cone]]s'', since they are characterized by a [[universal property]] (see below for more information). As with every universal property, the above definition describes a balanced state of generality: The limit object ''L'' has to be general enough to allow any other cone to factor through it; on the other hand, ''L'' has to be sufficiently specific, so that only ''one'' such factorization is possible for every cone.

Limits may also be characterized as [[terminal object]]s in the [[category of cones]] to ''F''.

It is possible that a diagram does not have a limit at all. However, if a diagram does have a limit then this limit is essentially unique: it is unique [[up to]] a unique isomorphism. For this reason one often speaks of ''the'' limit of ''F''.

===Colimits===
The [[Dual (category theory)|dual notions]] of limits and cones are colimits and co-cones. Although it is straightforward to obtain the definitions of these by inverting all morphisms in the above definitions, we will explicitly state them here:

A '''[[co-cone]]''' of a diagram ''F'' : ''J'' → ''C'' is an object ''N'' of ''C'' together with a family of morphisms
:ψ''X'' : ''F''(''X'') → ''N''
for every object ''X'' of ''J'', such that for every morphism ''f'' : ''X'' → ''Y'' in ''J'', we have ψ''Y'' o ''F''(''f'')= ψ''X''.

A '''colimit''' of a diagram ''F'' : ''J'' → ''C'' is a co-cone (''L'', <math>\phi</math>) of ''F'' such that for any other co-cone (''N'', ψ) of ''F'' there exists a unique morphism ''u'' : ''L'' → ''N'' such that ''u'' o <math>\phi</math>''X'' = ψ''X'' for all ''X'' in ''J''.

[[File:Functor co-cone (extended).svg|center|A universal co-cone]]

Colimits are also referred to as ''[[universal co-cone]]s''. They can be characterized as [[initial object]]s in the [[category of co-cones]] from ''F''.

As with limits, if a diagram ''F'' has a colimit then this colimit is unique up to a unique isomorphism.

===Variations===
Limits and colimits can also be defined for collections of objects and morphisms without the use of diagrams. The definitions are the same (note that in definitions above we never needed to use composition of morphisms in ''J''). This variation, however, adds no new information. Any collection of objects and morphisms defines a (possibly large) [[directed graph]] ''G''. If we let ''J'' be the [[free category]] generated by ''G'', there is a universal diagram ''F'' : ''J'' → ''C'' whose image contains ''G''. The limit (or colimit) of this diagram is the same as the limit (or colimit) of the original collection of objects and morphisms.

'''Weak limit''' and '''weak colimits''' are defined like limits and colimits, except that the uniqueness property of the mediating morphism is dropped.

==Examples==
===Limits===
The definition of limits is general enough to subsume several constructions useful in practical settings. In the following we will consider the limit (''L'', φ) of a diagram ''F'' : ''J'' → ''C''.
*'''[[Terminal object]]s'''. If ''J'' is the empty category there is only one diagram of type ''J'': the empty one (similar to the [[empty function]] in set theory). A cone to the empty diagram is essentially just an object of ''C''. The limit of ''F'' is any object that is uniquely factored through by every other object. This is just the definition of a ''terminal object''.
*'''[[Product (category theory)|Products]]'''. If ''J'' is a [[discrete category]] then a diagram ''F'' is essentially nothing but a [[indexed family|family]] of objects of ''C'', indexed by ''J''. The limit ''L'' of ''F'' is called the ''product'' of these objects. The cone φ consists of a family of morphisms φ''X'' : ''L'' → ''F''(''X'') called the ''projections'' of the product. In the [[category of sets]], for instance, the products are given by [[Cartesian product]]s and the projections are just the natural projections onto the various factors.
**'''Powers'''. A special case of a product is when the diagram ''F'' is a constant functor to an object ''X'' of ''C''. The limit of this diagram is called the ''Jth power'' of ''X'' and denoted ''X''''J''.
*'''[[Equalizer (mathematics)|Equalizer]]s'''. If ''J'' is a category with two objects and two parallel morphisms from object ''1'' to object ''2'' then a diagram of type ''J'' is a pair of parallel morphisms in ''C''. The limit ''L'' of such a diagram is called an ''equalizer'' of those morphisms.
**'''[[Kernel (category theory)|Kernel]]s'''. A ''kernel'' is a special case of an equalizer where one of the morphisms is a [[zero morphism]].
*'''[[Pullback (category theory)|Pullbacks]]'''. Let ''F'' be a diagram that picks out three objects ''X'', ''Y'', and ''Z'' in ''C'', where the only non-identity morphisms are ''f'' : ''X'' → ''Z'' and ''g'' : ''Y'' → ''Z''. The limit ''L'' of ''F'' is called a ''pullback'' or a ''fiber product''. It can nicely be visualized as a [[commutative diagram|commutative square]]:
[[Image:CategoricalPullback-01.png|center]]
*'''[[Inverse limit]]s'''. Let ''J'' be a [[directed set|directed]] [[poset]] (considered as a small category by adding arrows ''i'' → ''j'' if and only if ''i'' ≤ ''j'') and let ''F'' : ''J''op → ''C'' be a diagram. The limit of ''F'' is called (confusingly) an ''inverse limit'', ''projective limit'', or ''directed limit''.
*If ''J'' = '''1''', the category with a single object and morphism, then a diagram of type ''J'' is essentially just an object ''X'' of ''C''. A cone to an object ''X'' is just a morphism with codomain ''X''. A morphism ''f'' : ''Y'' → ''X'' is a limit of the diagram ''X'' if and only if ''f'' is an [[isomorphism]]. More generally, if ''J'' is any category with an [[initial object]] ''i'', then any diagram of type ''J'' has a limit, namely any object isomorphic to ''F''(''i''). Such an isomorphism uniquely determines a universal cone to ''F''.
*'''Topological limits'''. Limits of functions are a special case of [[Filter_(mathematics)#Convergent_filter_bases|limits of filters]], which are related to categorical limits as follows. Given a [[topological space]] ''X'', denote ''F'' the set of filters on ''X'', ''x'' ∈ ''X'' a point, ''V''(''x'') ∈ ''F'' the [[Filter_(mathematics)#Neighbourhood_bases|neighborhood filter]] of ''x'', ''A'' ∈ ''F'' a particular filter and <math>F_{x,A}=\{G\in F\, /\, V(x)\cup A \subset G\} </math> the set of filters finer than ''A'' and that converge to ''x''. The filters ''F'' are given a small and thin category structure by adding an arrow ''A'' → ''B'' if and only if ''A'' ⊆ ''B''. The injection <math>I_{x,A}:F_{x,A}\to F</math> becomes a functor and the following equivalence holds :

:: ''x'' is a topological limit of ''A'' if and only if ''A'' is a categorical limit of <math>I_{x,A}</math>

===Colimits===
Examples of colimits are given by the dual versions of the examples above:
*'''[[Initial object]]s''' are colimits of empty diagrams.
*'''[[Coproduct]]s''' are colimits of diagrams indexed by discrete categories.
**'''Copowers''' are colimits of constant diagrams from discrete categories.
*'''[[Coequalizer]]s''' are colimits of a parallel pair of morphisms.
**'''[[Cokernel]]s''' are coequalizers of a morphism and a parallel zero morphism.
*'''[[Pushout (category theory)|Pushouts]]''' are colimits of a pair of morphisms with common domain.
*'''[[Direct limit]]s''' are colimits of diagrams indexed by directed sets.

== Properties ==
=== Existence of limits ===
A given diagram ''F'' : ''J'' → ''C'' may or may not have a limit (or colimit) in ''C''. Indeed, there may not even be a cone to ''F'', let alone a universal cone.

A category ''C'' is said to '''have limits of type ''J''''' if every diagram of type ''J'' has a limit in ''C''. Specifically, a category ''C'' is said to
*'''have products''' if it has limits of type ''J'' for every ''small'' discrete category ''J'' (it need not have large products),
*'''have equalizers''' if it has limits of type <math>\bullet\rightrightarrows\bullet</math> (i.e. every parallel pair of morphisms has an equalizer),
*'''have pullbacks''' if it has limits of type <math>\bullet\rightarrow\bullet\leftarrow\bullet</math> (i.e. every pair of morphisms with common codomain has a pullback).
A '''[[complete category]]''' is a category that has all small limits (i.e. all limits of type ''J'' for every small category ''J'').

One can also make the dual definitions. A category '''has colimits of type ''J''''' if every diagram of type ''J'' has a colimit in ''C''. A '''[[cocomplete category]]''' is one that has all small colimits.

The '''existence theorem for limits''' states that if a category ''C'' has equalizers and all products indexed by the classes Ob(''J'') and Hom(''J''), then ''C'' has all limits of type ''J''. In this case, the limit of a diagram ''F'' : ''J'' → ''C'' can be constructed as the equalizer of the two morphisms
:<math>s,t : \prod_{i\in\mathrm{Ob}(J)}F(i) \rightrightarrows \prod_{f\in\mathrm{Hom}(J)} F(\mathrm{cod}(f))</math>
given (in component form) by
:<math>\begin{align}
s &= \bigl( F(f)\circ\pi_{F(\mathrm{dom}(f))}\bigr)_{f\in\mathrm{Hom}(J)} \\
t &= \bigl( \pi_{F(\mathrm{cod}(f))}\bigr)_{f\in\mathrm{Hom}(J)}.
\end{align}</math>
There is a dual '''existence theorem for colimits''' in terms of coequalizers and coproducts. Both of these theorems give sufficient but not necessary conditions for the existence of all (co)limits of type ''J''.

=== Universal property ===
Limits and colimits are important special cases of [[universal construction]]s. Let ''C'' be a category and let ''J'' be a small index category. The [[functor category]] ''C''''J'' may be thought of the category of all diagrams of type ''J'' in ''C''. The ''[[diagonal functor]]''
:<math>\Delta : \mathcal C \to \mathcal C^{\mathcal J}</math>
is the functor that maps each object ''N'' in ''C'' to the constant functor Δ(''N'') : ''J'' → ''C'' to ''N''. That is, Δ(''N'')(''X'') = ''N'' for each object ''X'' in ''J'' and Δ(''N'')(''f'') = id''N'' for each morphism ''f'' in ''J''.

Given a diagram ''F'': ''J'' → ''C'' (thought of as an object in ''C''''J''), a [[natural transformation]] ψ : Δ(''N'') → ''F'' (which is just a morphism in the category ''C''''J'') is the same thing as a cone from ''N'' to ''F''. The components of ψ are the morphisms ψ''X'' : ''N'' → ''F''(''X''). Dually, a natural transformation ψ : ''F'' → Δ(''N'') is the same thing as a co-cone from ''F'' to ''N''.

The definitions of limits and colimits can then be restated in the form:
*A limit of ''F'' is a universal morphism from Δ to ''F''.
*A colimit of ''F'' is a universal morphism from ''F'' to Δ.

=== Adjunctions ===
Like all universal constructions, the formation of limits and colimits is functorial in nature. In other words, if every diagram of type ''J'' has a limit in ''C'' (for ''J'' small) there exists a '''limit functor'''
:<math>\mathrm{lim} : \mathcal{C}^\mathcal{J} \to \mathcal{C}</math>
which assigns each diagram its limit and each [[natural transformation]] η : ''F'' → ''G'' the unique morphism lim η : lim ''F'' → lim ''G'' commuting with the corresponding universal cones. This functor is [[right adjoint]] to the diagonal functor Δ : ''C'' → ''C''''J''.
This adjunction gives a bijection between the set of all morphisms from ''N'' to lim ''F'' and the set of all cones from ''N'' to ''F''
:<math>\mathrm{Hom}(N,\mathrm{lim}F) \cong \mathrm{Cone}(N,F)</math>
which is natural in the variables ''N'' and ''F''. The counit of this adjunction is simply the universal cone from lim ''F'' to ''F''. If the index category ''J'' is [[connected category|connected]] (and nonempty) then the unit of the adjunction is an isomorphism so that lim is a left inverse of Δ. This fails if ''J'' is not connected. For example, if ''J'' is a discrete category, the components of the unit are the [[diagonal morphism]]s δ : ''N'' → ''N''''J''.

Dually, if every diagram of type ''J'' has a colimit in ''C'' (for ''J'' small) there exists a '''colimit functor'''
:<math>\mathrm{colim} : \mathcal{C}^\mathcal{J} \to \mathcal{C}</math>
which assigns each diagram its colimit. This functor is [[left adjoint]] to the diagonal functor Δ : ''C'' → ''C''''J'', and one has a natural isomorphism
:<math>\mathrm{Hom}(\mathrm{colim}F,N) \cong \mathrm{Cocone}(F,N).</math>
The unit of this adjunction is the universal cocone from ''F'' to colim ''F''. If ''J'' is connected (and nonempty) then the counit is an isomorphism, so that colim is a left inverse of Δ.

Note that both the limit and the colimit functors are [[covariant functor|''covariant'']] functors.

=== As representations of functors ===
One can use [[Hom functor]]s to relate limits and colimits in a category ''C'' to limits in '''Set''', the [[category of sets]]. This follows, in part, from the fact the covariant Hom functor Hom(''N'', –) : ''C'' → '''Set''' [[#Preservation of limits|preserves all limits]] in ''C''. By duality, the contravariant Hom functor must take colimits to limits.

If a diagram ''F'' : ''J'' → ''C'' has a limit in ''C'', denoted by lim ''F'', there is a [[canonical isomorphism]]
:<math>\mathrm{Hom}(N,\mathrm{lim} F)\cong\mathrm{lim}\,\mathrm{Hom}(N,F-)</math>
which is natural in the variable ''N''. Here the functor Hom(''N'', ''F''–) is the composition of the Hom functor Hom(''N'', –) with ''F''. This isomorphism is the unique one which respects the limiting cones.

One can use the above relationship to define the limit of ''F'' in ''C''. The first step is to observe that the limit of the functor Hom(''N'', ''F''–) can be identified with the set of all cones from ''N'' to ''F'':
:<math>\mathrm{lim}\,\mathrm{Hom}(N,F-) = \mathrm{Cone}(N,F).</math>
The limiting cone is given by the family of maps π''X'' : Cone(''N'', ''F'') → Hom(''N'', ''FX'') where π''X''(ψ) = ψ''X''. If one is given an object ''L'' of ''C'' together with a [[natural isomorphism]] Φ : Hom(–, ''L'') → Cone(–, ''F''), the object ''L'' will be a limit of ''F'' with the limiting cone given by Φ''L''(id''L''). In fancy language, this amounts to saying that a limit of ''F'' is a [[representable functor|representation]] of the functor Cone(–, ''F'') : ''C'' → '''Set'''.

Dually, if a diagram ''F'' : ''J'' → ''C'' has a colimit in ''C'', denoted colim ''F'', there is a unique canonical isomorphism
:<math>\mathrm{Hom}(\mathrm{colim} F, N)\cong\mathrm{lim}\,\mathrm{Hom}(F-,N)</math>
which is natural in the variable ''N'' and respects the colimiting cones. Identifying the limit of Hom(''F''–, ''N'') with the set Cocone(''F'', ''N''), this relationship can be used to define the colimit of the diagram ''F'' as a representation of the functor Cocone(''F'', –).

=== Interchange of limits and colimits of sets===
Let ''I'' be a finite category and ''J'' be a small [[filtered category]]. For any [[bifunctor]]

:''F'' : ''I'' × ''J'' → '''Set'''

there is a [[natural isomorphism]]

:<math>\mathrm{colim}_J\,\mathrm{lim}_I F(i, j) \rightarrow \mathrm{lim}_I\,\mathrm{colim}_J F(i, j).</math>

In words, filtered colimits in '''Set''' commute with finite limits.

==Functors and limits==
If ''F'' : ''J'' → ''C'' is a diagram in ''C'' and ''G'' : ''C'' → ''D'' is a [[functor]] then by composition (recall that a diagram is just a functor) one obtains a diagram ''GF'' : ''J'' → ''D''. A natural question is then:
:“How are the limits of ''GF'' related to those of ''F''?”

===Preservation of limits===
A functor ''G'' : ''C'' → ''D'' induces a map from Cone(''F'') to Cone(''GF''): if Ψ is a cone from ''N'' to ''F'' then ''G''Ψ is a cone from ''GN'' to ''GF''. The functor ''G'' is said to '''preserve the limits of ''F''''' if (''GL'', ''G''φ) is a limit of ''GF'' whenever (''L'', φ) is a limit of ''F''. (Note that if the limit of ''F'' does not exist, then ''G'' [[vacuous truth|vacuously]] preserves the limits of ''F''.)

A functor ''G'' is said to '''preserve all limits of type ''J''''' if it preserves the limits of all diagrams ''F'' : ''J'' → ''C''. For example, one can say that ''G'' preserves products, equalizers, pullbacks, etc. A '''continuous functor''' is one that preserves all ''small'' limits.

One can make analogous definitions for colimits. For instance, a functor ''G'' preserves the colimits of ''F'' if ''G''(''L'', φ) is a colimit of ''GF'' whenever (''L'', φ) is a colimit of ''F''. A '''cocontinuous functor''' is one that preserves all ''small'' colimits.

If ''C'' is a [[complete category]], then, by the above existence theorem for limits, a functor ''G'' : ''C'' → ''D'' is continuous if and only if it preserves (small) products and equalizers. Dually, ''G'' is cocontinuous if and only if it preserves (small) coproducts and coequalizers.

An important property of [[adjoint functors]] is that every right adjoint functor is continuous and every left adjoint functor is cocontinuous. Since adjoint functors exist in abundance, this gives numerous examples of continuous and cocontinuous functors.

For a given diagram ''F'' : ''J'' → ''C'' and functor ''G'' : ''C'' → ''D'', if both ''F'' and ''GF'' have specified limits there is a unique canonical morphism
:τ''F'' : ''G'' lim ''F'' → lim ''GF''
which respects the corresponding limit cones. The functor ''G'' preserves the limits of ''F'' if and only this map is an isomorphism. If the categories ''C'' and ''D'' have all limits of type ''J'' then lim is a functor and the morphisms τ''F'' form the components of a [[natural transformation]]
:τ : ''G'' lim → lim ''G''''J''.
The functor ''G'' preserves all limits of type ''J'' if and only if τ is a natural isomorphism. In this sense, the functor ''G'' can be said to ''commute with limits'' ([[up to]] a canonical natural isomorphism).

Preservation of limits and colimits is a concept that only applies to ''[[covariant functor|covariant]]'' functors. For [[contravariant functor]]s the corresponding notions would be a functor that takes colimits to limits, or one that takes limits to colimits.

===Lifting of limits===
A functor ''G'' : ''C'' → ''D'' is said to '''lift limits''' for a diagram ''F'' : ''J'' → ''C'' if whenever (''L'', φ) is a limit of ''GF'' there exists a limit (''L''′, φ′) of ''F'' such that ''G''(''L''′, φ′) = (''L'', φ). A functor ''G'' '''lifts limits of type ''J''''' if it lifts limits for all diagrams of type ''J''. One can therefore talk about lifting products, equalizers, pullbacks, etc. Finally, one says that ''G'' '''lifts limits''' if it lifts all limits. There are dual definitions for the lifting of colimits.

A functor ''G'' '''lifts limits uniquely''' for a diagram ''F'' if there is a unique preimage cone (''L''′, φ′) such that (''L''′, φ′) is a limit of ''F'' and ''G''(''L''′, φ′) = (''L'', φ). One can show that ''G'' lifts limits uniquely if and only if it lifts limits and is [[amnestic functor|amnestic]].

Lifting of limits is clearly related to preservation of limits. If ''G'' lifts limits for a diagram ''F'' and ''GF'' has a limit, then ''F'' also has a limit and ''G'' preserves the limits of ''F''. It follows that:
*If ''G'' lifts limits of all type ''J'' and ''D'' has all limits of type ''J'', then ''C'' also has all limits of type ''J'' and ''G'' preserves these limits.
*If ''G'' lifts all small limits and ''D'' is complete, then ''C'' is also complete and ''G'' is continuous.
The dual statements for colimits are equally valid.

===Creation and reflection of limits===
Let ''F'' : ''J'' → ''C'' be a diagram. A functor ''G'' : ''C'' → ''D'' is said to
*'''create limits''' for ''F'' if whenever (''L'', φ) is a limit of ''GF'' there exists a unique cone (''L''′, φ′) to ''F'' such that ''G''(''L''′, φ′) = (''L'', φ), and furthermore, this cone is a limit of ''F''.
*'''reflect limits''' for ''F'' if each cone to ''F'' whose image under ''G'' is a limit of ''GF'' is already a limit of ''F''.
Dually, one can define creation and reflection of colimits.

The following statements are easily seen to be equivalent:
*The functor ''G'' creates limits.
*The functor ''G'' lifts limits uniquely and reflects limits.
There are examples of functors which lift limits uniquely but neither create nor reflect them.

===Examples===
* For any category ''C'' and object ''A'' of ''C'' the [[Hom functor]] Hom(''A'',–) : ''C'' → '''Set''' preserves all limits in ''C''. In particular, Hom functors are continuous. Hom functors need not preserve colimits.
* Every [[representable functor]] ''C'' → '''Set''' preserves limits (but not necessarily colimits).
* The [[forgetful functor]] ''U'' : '''Grp''' → '''Set''' creates (and preserves) all small limits and [[filtered colimit]]s; however, ''U'' does not preserve coproducts. This situation is typical of algebraic forgetful functors.
* The [[free functor]] ''F'' : '''Set''' → '''Grp''' (which assigns to every set ''S'' the [[free group]] over ''S'') is left adjoint to forgetful functor ''U'' and is, therefore, cocontinuous. This explains why the [[free product]] of two free groups ''G'' and ''H'' is the free group generated by the [[disjoint union]] of the generators of ''G'' and ''H''.
* The inclusion functor '''Ab''' → '''Grp''' creates limits but does not preserve coproducts (the coproduct of two abelian groups being the [[Direct sum of abelian groups|direct sum]]).
* The forgetful functor '''Top''' → '''Set''' lifts limits and colimits uniquely but creates neither.
* Let '''Met'''''c'' be the category of [[metric space]]s with [[continuous function]]s for morphisms. The forgetful functor '''Met'''''c'' → '''Set''' lifts finite limits but does not lift them uniquely.

== A note on terminology ==
Older terminology referred to limits as "inverse limits" or "projective limits," and to colimits as "direct limits" or "inductive limits." This has been the source of a lot of confusion.

There are several ways to remember the modern terminology. First of all,
*cokernels,
*coequalizers, and
*codomains
are types of colimits, whereas
*kernels,
*equalizers, and
*domains
are types of limits. Second, the prefix "co" implies "first variable of the <math>\operatorname{Hom}</math>". Terms like "cohomology" and "cofibration" all have a slightly stronger association with the first variable, i.e., the contravariant variable, of the <math>\operatorname{Hom}</math> bifunctor.

== References ==
*{{cite book | last = Adámek | first = Jiří | coauthors = Horst Herrlich, and George E. Strecker | year = 1990 | url = http://katmat.math.uni-bremen.de/acc/acc.pdf | title = Abstract and Concrete Categories|publisher = John Wiley & Sons | isbn = 0-471-60922-6}}
*{{cite book | first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = Graduate Texts in Mathematics '''5''' | edition = 2nd ed. | publisher = Springer | isbn = 0-387-98403-8}}

== External links ==
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page ] which generates examples of limits and colimits in the category of finite sets. Written by [http://www.j-paine.org/ Jocelyn Paine].

[[Category:Limits (category theory)| ]]

[[ca:Límit (teoria de categories)]]
[[de:Limes (Kategorientheorie)]]
[[es:Límite (teoría de categorías)]]
[[ko:극한 (범주론)]]
[[nl:Limiet (categorietheorie)]]
[[pl:Granica i kogranica]]
[[ru:Предел (теория категорий)]]
[[zh:极限 (范畴论)]]

Ignoramus et ignorabimus

2012-11-18T11:36:46Z

Magmalex: /* Hilbert's reaction */

{{italic title}}
[[Image:Bois-Reymond.jpg|thumb|[[Emil du Bois-Reymond]], promulgator of the maxim ''ignoramus et ignorabimus.'']]
The [[Latin]] maxim '''''ignoramus et ignorabimus''''', meaning "we do not know and will not know", stood for a position on the limits of [[scientific knowledge]], in the thought of the nineteenth century. It was given credibility by [[Emil du Bois-Reymond]], a German [[physiologist]], in his ''Über die Grenzen des Naturerkennens'' ("On the limits of our understanding of nature") of 1872.

== Hilbert's reaction ==

On the 8th of September 1930, the [[mathematician]] [[David Hilbert]] pronounced his disagreement in a celebrated address to the Society of German Scientists and Physicians, in [[Königsberg]]:<ref>[[David Hilbert|Hilbert, David]], [http://math.sfsu.edu/smith/Documents/HilbertRadio/HilbertRadio.mp3 audio address], [http://math.sfsu.edu/smith/Documents/HilbertRadio/HilbertRadio.pdf transcription and English translation].</ref>

{{Cquote|We must not believe those, who today, with philosophical bearing and deliberative tone, prophesy the fall of culture and accept the ''ignorabimus''. For us there is no ''ignorabimus'', and in my opinion none whatever in natural science. In opposition to the foolish ''ignorabimus'' our slogan shall be: '''Wir müssen wissen — wir werden wissen!''' ('We must know — we will know!')}}

Already in 1900, at the International Congress of Mathematicians at Paris he said: "In mathematics there is no ''ignorabimus''."<ref>{{Cite journal |author=D. Hilbert |title=Mathematical Problems: Lecture Delivered before the International Congress of Mathematicians at Paris in 1900 |journal=Bulletin of the American Mathematical Society |volume=8 |year=1902 |pages=437-79 |url=http://aleph0.clarku.edu/~djoyce/hilbert/problems.html }}</ref>

Hilbert worked with other [[Formalism (mathematics)|formalist]]s to establish concrete [[foundations of mathematics#foundation crisis|foundations for mathematics]] in the early 20th century. However, [[Gödel's incompleteness theorems]] showed in 1931 that no finite system of [[axiom]]s, if complex enough to express our usual [[arithmetic]], could ever fulfill the goals of [[Hilbert's program]], demonstrating many of Hilbert's aims impossible, and specifying limits on most [[axiomatic system]]s.

[[Image:Hilbert.jpg|thumb|[[David Hilbert]] replied, ''Wir müssen wissen — wir werden wissen!'' (We must know — we will know!)]]

== Seven World Riddles ==
[[Emil du Bois-Reymond]] used ''ignoramus et ignorabimus'' in discussing what he called [[Emil_du_Bois-Reymond#The_Seven_World_Riddles|seven "world riddles"]], in a famous 1880 speech before the [[Prussian Academy of Sciences|Berlin Academy of Sciences]].

He outlined seven "world riddles", of which three, he declared, neither science nor philosophy could ever explain, because they are "[[Transcendence (philosophy)#Kant_.28and_modern_philosophy.29|transcendent]]". Of the riddles, he considered the following transcendental and declared of them ''ignoramus et ignorabimus:''<ref>William E. Leverette Jr., ''E. L. Youmans' Crusade for Scientific Autonomy and Respectability'', American Quarterly, Vol. 17, No. 1. (Spring, 1965), pg. 21.</ref>
"1. the ultimate nature of matter and force,
2. the origin of motion,...
5. the origin of simple [[Sense|sensations]], a quite transcendent question." However, depending on the interpretation of "ultimate nature" and "origin," it is possible to consider some of these as partially or completely solved. For example, the [[sensory system]]s for the traditional senses (sight, hearing, taste, smell, touch) are now mostly understood, including some of the associated neural processing.

== Sociological responses ==
The [[sociologist]] [[Wolf Lepenies]] has discussed the ''ignorabimus'' with a view that du Bois-Reymond was not really retreating in his claims for science and its reach:<ref>{{cite book|last=Lepenies|first=Wolf|authorlink=Wolf Lepenies|title=Between Literature and Science: the Rise of Sociology|year=1988|publisher=Cambridge University Press|location=Cambridge, UK|isbn=0-521-33810-7|page=272}}</ref>

:''— it is in fact an incredibly self-confident support for scientific hubris masked as modesty —''

This is in a discussion of [[Friedrich Wolters]], one of the members of the literary group "[[George-Kreis]]". Lepenies comments that Wolters misunderstood the degree of pessimism being expressed about science, but well understood the implication that scientists themselves could be trusted with self-criticism.

==See also==
*[[Hubris]]
*[[Strong agnosticism]]
*[[Unknown unknown]]
* [[Ignorance management]]
*[[I know that I know nothing]]

==Notes==
{{reflist}}

{{philosophy of science}}

[[Category:Epistemology of science]]
[[Category:Slogans]]
[[Category:Latin words and phrases]]
[[Category:Concepts in epistemology]]

[[cs:Ignoramus et ignorabimus]]
[[de:Ignoramus et ignorabimus]]
[[es:Ignoramus et ignorabimus]]
[[fr:Ignorabimus]]
[[ja:我々は知らない、知ることはないだろう]]
[[pl:Ignoramus et ignorabimus]]
[[pt:Ignoramus et ignorabimus]]
[[ru:Ignoramus et ignorabimus]]
[[sk:Ignoramus et ignorabimus]]
[[tr:Ignoramus et ignorabimus]]
[[uk:Ignorabimus]]

Natural transformation

2012-11-11T23:55:19Z

Magmalex: /* Historical notes */ Inline reference

{{About|natural transformations in category theory|the natural competence of bacteria to take up foreign DNA|Transformation (genetics)}}
{{other uses|Transformation (mathematics) (disambiguation)}}
In [[category theory]], a branch of [[mathematics]], a '''natural transformation''' provides a way of transforming one [[functor]] into another while respecting the internal structure (i.e. the composition of [[morphism]]s) of the categories involved. Hence, a natural transformation can be considered to be a "morphism of functors". Indeed this intuition can be formalized to define so-called [[functor category|functor categories]]. Natural transformations are, after categories and functors, one of the most basic notions of [[category theory]] and consequently appear in the majority of its applications.

==Definition==
If ''F'' and ''G'' are [[functor]]s between the categories ''C'' and ''D'', then a '''natural transformation''' η from ''F'' to ''G'' associates to every object ''X'' in ''C'' a [[morphism]] {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} between objects of ''D'', called the '''component''' of η at ''X'', such that for every morphism {{nobreak|1=''f'' : ''X'' → ''Y'' in ''C''}} we have:

:<math>\eta_Y \circ F(f) = G(f) \circ \eta_X</math>

This equation can conveniently be expressed by the [[commutative diagram]]

[[File:Natural transformation.svg|175px]]

If both ''F'' and ''G'' are [[contravariant functor|contravariant]], the horizontal arrows in this diagram are reversed. If η is a natural transformation from ''F'' to ''G'', we also write {{nobreak|1=η : ''F'' → ''G''}} or {{nobreak|1=η : ''F'' ⇒ ''G''}}. This is also expressed by saying the family of morphisms {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} is '''natural''' in ''X''.

If, for every object ''X'' in ''C'', the morphism η''X'' is an [[isomorphism]] in ''D'', then η is said to be a '''{{visible anchor|natural isomorphism}}''' (or sometimes '''natural equivalence''' or '''isomorphism of functors'''). Two functors ''F'' and ''G'' are called ''naturally isomorphic'' or simply ''isomorphic'' if there exists a natural isomorphism from ''F'' to ''G''.

An '''infranatural transformation''' η from ''F'' to ''G'' is simply a family of morphisms {{nobreak|1=η''X'': ''F''(''X'') → ''G''(''X'')}}. Thus a natural transformation is an infranatural transformation for which {{nobreak|1=η''Y'' ∘ ''F''(''f'') = ''G''(''f'') ∘ η''X''}} for every morphism {{nobreak|1=''f'' : ''X'' → ''Y''}}. The '''naturalizer''' of η, nat(η), is the largest [[subcategory]] of ''C'' containing all the objects of ''C'' on which η restricts to a natural transformation.

==Examples==
===Opposite group===
{{details|Opposite group}}
Statements such as
:"Every group is naturally isomorphic to its [[opposite group]]"
abound in modern mathematics. We will now give the precise meaning of this statement as well as its proof. Consider the category '''Grp''' of all [[group (mathematics)|group]]s with [[group homomorphism]]s as morphisms. If (''G'',*) is a group, we define its opposite group (''G''op,*op) as follows: ''G''op is the same set as ''G'', and the operation *op is defined by {{nobreak|1=''a'' *op ''b'' = ''b'' * ''a''}}. All multiplications in ''G''op are thus "turned around". Forming the [[Opposite category|opposite]] group becomes a (covariant!) functor from '''Grp''' to '''Grp''' if we define {{nobreak|1=''f''op = ''f''}} for any group homomorphism {{nobreak|1=''f'': ''G'' → ''H''}}. Note that ''f''op is indeed a group homomorphism from ''G''op to ''H''op:
:''f''op(''a'' *op ''b'') = ''f''(''b'' * ''a'') = ''f''(''b'') * ''f''(''a'') = ''f''op(''a'') *op ''f''op(''b'').
The content of the above statement is:
:"The identity functor {{nobreak|1=Id'''Grp''' : '''Grp''' → '''Grp'''}} is naturally isomorphic to the opposite functor {{nobreak|1=op : '''Grp''' → '''Grp'''}}."
To prove this, we need to provide isomorphisms {{nobreak|1=η''G'' : ''G'' → ''G''op}} for every group ''G'', such that the above diagram commutes. Set {{nobreak|1=η''G''(''a'') = ''a''−1}}. The formulas {{nobreak|1=(''ab'')−1 = ''b''−1 ''a''−1}} and {{nobreak|1=(''a''−1)−1 = ''a''}} show that η''G'' is a group homomorphism which is its own inverse. To prove the naturality, we start with a group homomorphism {{nobreak|1=''f'' : ''G'' → ''H''}} and show {{nobreak|1=η''H'' ∘ ''f'' = ''f''op ∘ η''G''}}, i.e. {{nobreak|1=(''f''(''a''))−1 = ''f''op(''a''−1)}} for all ''a'' in ''G''. This is true since {{nobreak|1=''f''op = ''f''}} and every group homomorphism has the property {{nobreak|1=(''f''(''a''))−1 = ''f''(''a''−1)}}.

===Double dual of a finite dimensional vector space===
If ''K'' is a [[field (mathematics)|field]], then for every [[vector space]] ''V'' over ''K'' we have a "natural" [[injective]] [[linear map]] {{nobreak|1=''V'' → ''V''**}} from the vector space into its [[double dual]]. These maps are "natural" in the following sense: the double dual operation is a functor, and the maps are the components of a natural transformation from the identity functor to the double dual functor.

===Counterexample: dual of a finite-dimensional vector space===
Every finite-dimensional vector space is isomorphic to its dual space, but this isomorphism relies on an arbitrary choice of isomorphism (for example, via choosing a basis and then taking the isomorphism sending this basis to the corresponding [[dual basis]]). There is in general no natural isomorphism between a finite-dimensional vector space and its dual space.<ref>{{harv|MacLane|Birkhoff|1999|loc=§VI.4}}</ref> However, related categories (with additional structure and restrictions on the maps) do have a natural isomorphism, as described below.

The dual space of a finite-dimensional vector space is again a finite-dimensional vector space of the same dimension, and these are thus isomorphic, since dimension is the only invariant of finite-dimensional vector spaces over a given field. However, in the absence of additional data (such as a basis), there is no given map from a space to its dual, and thus such an isomorphism requires a choice, and is "not natural". On the category of finite-dimensional vector spaces and linear maps, one can define an infranatural isomorphism from vector spaces to their dual by choosing an isomorphism for each space (say, by choosing a basis for every vector space and taking the corresponding isomorphism), but this will not define a natural transformation. Intuitively this is because it required a choice, rigorously because ''any'' such choice of isomorphisms will not commute with ''all'' linear maps; see {{harv|MacLane|Birkhoff|1999|loc=§VI.4}} for detailed discussion.

Starting from finite-dimensional vector spaces (as objects) and the dual functor, one can define a natural isomorphism, but this requires first adding additional structure, then restricting the maps from "all linear maps" to "linear maps that respect this structure". Explicitly, for each vector space, require that it come with the data of an isomorphism to its dual, <math>\eta_V\colon V \to V^*.</math> In other words, take as objects vector spaces with a [[nondegenerate bilinear form]] <math>b_V\colon V \times V \to K.</math> This defines an infranatural isomorphism (isomorphism for each object). One then restricts the maps to only those maps that commute with these isomorphism (restricts to the naturalizer of ''η''), in other words, restrict to the maps that do not change the bilinear form: <math>b(T(v),T(w))=b(v,w).</math> The resulting category, with objects finite-dimensional vector spaces with a nondegenerate bilinear form, and maps linear transforms that respect the bilinear form, by construction has a natural isomorphism from the identity to the dual (each space has an isomorphism to its dual, and the maps in the category are required to commute). Viewed in this light, this construction (add transforms for each object, restrict maps to commute with these) is completely general, and does not depend on any particular properties of vector spaces.

In this category (finite-dimensional vector spaces with a nondegenerate bilinear form, maps linear transforms that respect the bilinear form), the dual of a map between vector spaces can be identified as a [[transpose]]. Often for reasons of geometric interest this is specialized to a subcategory, by requiring that the nondegenerate bilinear forms have additional properties, such as being symmetric ([[orthogonal matrices]]), symmetric and positive definite ([[inner product space]]), symmetric sesquilinear ([[Hermitian space]]s), skew-symmetric and totally isotropic ([[symplectic vector space]]), etc. – in all these categories a vector space is naturally identified with its dual, by the nondegenerate bilinear form.

===Tensor-hom adjunction===
{{see|Tensor-hom adjunction|Adjoint functors}}
Consider the [[category of abelian groups|category '''Ab''' of abelian groups and group homomorphisms]]. For all abelian groups ''X'', ''Y'' and ''Z'' we have a group isomorphism
:{{nobreak|1=Hom(''X'' {{otimes}} ''Y'', ''Z'') → Hom(''X'', Hom(''Y'', ''Z''))}}.
These isomorphisms are "natural" in the sense that they define a natural transformation between the two involved functors {{nobreak|1='''Ab''' × '''Ab'''op × '''Ab'''op → '''Ab'''}}.

This is formally the [[tensor-hom adjunction]], and is an archetypal example of a pair of [[adjoint functors]]. Natural transformations arise frequently in conjunction with adjoint functors, and indeed, adjoint functors are defined by a certain natural isomorphism. Additionally, every pair of adjoint functors comes equipped with two natural transformations (generally not isomorphisms) called the ''unit'' and ''counit''.

== Unnatural isomorphism ==
The notion of a natural transformation is categorical, and states (informally) that a particular map between functors can be done consistently over an entire category. Informally, a particular map (esp. an isomorphism) between individual objects (not entire categories) is referred to as a "natural isomorphism", meaning implicitly that it is actually defined on the entire category, and defines a natural transformation of functors; formalizing this intuition was a motivating factor in the development of category theory. Conversely, a particular map between particular objects may be called an '''unnatural isomorphism''' (or "this isomorphism is not natural") if the map cannot be extended to a natural transformation on the entire category. Given an object ''X,'' a functor ''G'' (taking for simplicity the first functor to be the identity) and an isomorphism <math>\eta\colon X \to G(X),</math> proof of unnaturality is most easily shown by giving an automorphism <math>A\colon X \to X</math> that does not commute with this isomorphism (so <math>\eta \circ A \neq G(A) \circ \eta</math>). More strongly, if one wishes to prove that ''X'' and ''G''(''X'') are not naturally isomorphic, without reference to a particular isomorphism, this requires showing that for ''any'' isomorphism ''η,'' there is some ''A'' with which it does not commute; in some cases a single automorphism ''A'' works for all candidate isomorphisms ''η,'' while in other cases one must show how to construct a different ''A''''η'' for each isomorphism. The maps of the category play a crucial role – any infranatural transform is natural if the only maps are the identity map, for instance.

This is similar (but more categorical) to concepts in group theory or module theory, where a given decomposition of an object into a direct sum is "not natural", or rather "not unique", as automorphisms exist that do not preserve the direct sum decomposition – see [[Structure theorem for finitely generated modules over a principal ideal domain#Uniqueness]] for example.

Some authors distinguish notationally, using ≅ for a natural isomorphism and ≈ for an unnatural isomorphism, reserving = for equality (usually equality of maps).

== Operations with natural transformations ==
If {{nobreak|1=η : ''F'' → ''G''}} and {{nobreak|1=ε : ''G'' → ''H''}} are natural transformations between functors {{nobreak|1=''F'',''G'',''H'' : ''C'' → ''D''}}, then we can compose them to get a natural transformation {{nobreak|1=εη : ''F'' → ''H''}}. This is done componentwise: {{nobreak|1=(εη)''X'' = ε''X''η''X''}}. This "vertical composition" of natural transformation is [[associative]] and has an identity, and allows one to consider the collection of all functors {{nobreak|1=''C'' → ''D''}} itself as a category (see below under [[#Functor categories|Functor categories]]).

Natural transformations also have a "horizontal composition". If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}} and {{nobreak|1=ε : ''J'' → ''K''}} is a natural transformation between functors {{nobreak|1=''J'',''K'' : ''D'' → ''E''}}, then the composition of functors allows a composition of natural transformations {{nobreak|1=ηε : ''JF'' → ''KG''}}. This operation is also associative with identity, and the identity coincides with that for vertical composition. The two operations are related by an identity which exchanges vertical composition with horizontal composition.

If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}}, and {{nobreak|1=''H'' : ''D'' → ''E''}} is another functor, then we can form the natural transformation {{nobreak|1=''H''η : ''HF'' → ''HG''}} by defining

:<math> (H \eta)_X = H \eta_X. </math>

If on the other hand {{nobreak|1=''K'' : ''B'' → ''C''}} is a functor, the natural transformation {{nobreak|1=η''K'' : ''FK'' → ''GK''}} is defined by

:<math> (\eta K)_X = \eta_{K(X)}.\, </math>

==Functor categories==

{{Main|Functor category}}
If ''C'' is any category and ''I'' is a [[small category]], we can form the [[functor category]] ''CI'' having as objects all functors from ''I'' to ''C'' and as morphisms the natural transformations between those functors. This forms a category since for any functor ''F'' there is an identity natural transformation {{nobreak|1=1''F'' : ''F'' → ''F''}} (which assigns to every object ''X'' the identity morphism on ''F''(''X'')) and the composition of two natural transformations (the "vertical composition" above) is again a natural transformation.

The [[isomorphism]]s in ''CI'' are precisely the natural isomorphisms. That is, a natural transformation {{nobreak|1=η : ''F'' → ''G''}} is a natural isomorphism if and only if there exists a natural transformation {{nobreak|1=ε : ''G'' → ''F''}} such that {{nobreak|1=ηε = 1''G''}} and {{nobreak|1=εη = 1''F''}}.

The functor category ''CI'' is especially useful if ''I'' arises from a [[directed graph]]. For instance, if ''I'' is the category of the directed graph {{nobreak|1=• → •}}, then ''CI'' has as objects the morphisms of ''C'', and a morphism between {{nobreak|1=φ : ''U'' → ''V''}} and {{nobreak|1=ψ : ''X'' → ''Y''}} in ''CI'' is a pair of morphisms {{nobreak|1=''f'' : ''U'' → ''X''}} and {{nobreak|1=''g'' : ''V'' → ''Y''}} in ''C'' such that the "square commutes", i.e. {{nobreak|1=ψ ''f'' = ''g'' φ}}.

More generally, one can build the [[2-category]] '''Cat''' whose
* 0-cells (objects) are the small categories,
* 1-cells (arrows) between two objects <math>C</math> and <math>D</math> are the functors from <math>C</math> to <math>D</math>,
* 2-cells between two 1-cells (functors) <math>F:C\to D</math> and <math>G:C\to D</math> are the natural transformations from <math>F</math> to <math>G</math>.
The horizontal and vertical compositions are the compositions between natural transformations described previously. A functor category <math>C^I</math> is then simply a hom-category in this category (smallness issues aside).

==Yoneda lemma==

{{Main|Yoneda lemma}}
If ''X'' is an object of a [[locally small category]] ''C'', then the assignment {{nobreak|1=''Y'' {{mapsto}} Hom''C''(''X'', ''Y'')}} defines a covariant functor {{nobreak|1=''F''''X'' : ''C'' → '''Set'''}}. This functor is called ''[[representable functor|representable]]'' (more generally, a representable functor is any functor naturally isomorphic to this functor for an appropriate choice of ''X''). The natural transformations from a representable functor to an arbitrary functor {{nobreak|1=''F'' : ''C'' → '''Set'''}} are completely known and easy to describe; this is the content of the [[Yoneda lemma]].

== Historical notes ==

[[Saunders Mac Lane]], one of the founders of category theory, is said to have remarked, "I didn't invent categories to study functors; I invented them to study natural transformations."<ref>{{harv|Mac Lane|1998|loc=§I.4}}</ref> Just as the study of [[group (mathematics)|groups]] is not complete without a study of [[group homomorphism|homomorphisms]], so the study of categories is not complete without the study of [[functor]]s. The reason for Mac Lane's comment is that the study of functors is itself not complete without the study of natural transformations.

The context of Mac Lane's remark was the axiomatic theory of [[homology (mathematics)|homology]]. Different ways of constructing homology could be shown to coincide: for example in the case of a [[simplicial complex]] the groups defined directly would be isomorphic to those of the singular theory. What cannot easily be expressed without the language of natural transformations is how homology groups are compatible with morphisms between objects, and how two equivalent homology theories not only have the same homology groups, but also the same morphisms between those groups.

==Symbols used==
* {{unichar|2297|CIRCLED TIMES|html=}}

== See also ==
* [[Extranatural transformation]]

== References ==
{{Portal|Category theory}}
{{reflist}}
{{refbegin}}
*{{citation| first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = Graduate Texts in Mathematics '''5''' | edition = 2nd | publisher = Springer-Verlag | isbn = 0-387-98403-8}}
* {{citation|first1=Saunders|last1=MacLane|authorlink1=Saunders MacLane|first2=Garrett|last2=Birkhoff|authorlink2=Garrett Birkhoff|title=Algebra|edition=3rd|publisher=AMS Chelsea Publishing|year=1999|isbn=0-8218-1646-2}}.
{{refend}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [[André Joyal]], [http://ncatlab.org/nlab CatLab], a wiki project dedicated to the exposition of categorical mathematics
* {{cite web | first = Chris | last = Hillman | title = A Categorical Primer | id = {{citeseerx|10.1.1.24.3264}} | postscript = : }} formal introduction to category theory.
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
* [[Stanford Encyclopedia of Philosophy]]: "[http://plato.stanford.edu/entries/category-theory/ Category Theory]" -- by Jean-Pierre Marquis. Extensive bibliography.
* [http://www.mta.ca/~cat-dist/ List of academic conferences on category theory]
* Baez, John, 1996,"[http://math.ucr.edu/home/baez/week73.html The Tale of ''n''-categories.]" An informal introduction to higher order categories.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Functors]]

[[es:Transformación natural]]
[[fr:Transformation naturelle]]
[[it:Trasformazione naturale]]
[[nl:Natuurlijke transformatie]]
[[ja:自然変換]]
[[pl:Transformacja naturalna]]
[[ru:Естественное преобразование]]
[[sv:Naturlig transformation]]

Natural transformation

2012-11-11T23:52:08Z

Magmalex: /* Historical notes */ Inline reference

{{About|natural transformations in category theory|the natural competence of bacteria to take up foreign DNA|Transformation (genetics)}}
{{other uses|Transformation (mathematics) (disambiguation)}}
In [[category theory]], a branch of [[mathematics]], a '''natural transformation''' provides a way of transforming one [[functor]] into another while respecting the internal structure (i.e. the composition of [[morphism]]s) of the categories involved. Hence, a natural transformation can be considered to be a "morphism of functors". Indeed this intuition can be formalized to define so-called [[functor category|functor categories]]. Natural transformations are, after categories and functors, one of the most basic notions of [[category theory]] and consequently appear in the majority of its applications.

==Definition==
If ''F'' and ''G'' are [[functor]]s between the categories ''C'' and ''D'', then a '''natural transformation''' η from ''F'' to ''G'' associates to every object ''X'' in ''C'' a [[morphism]] {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} between objects of ''D'', called the '''component''' of η at ''X'', such that for every morphism {{nobreak|1=''f'' : ''X'' → ''Y'' in ''C''}} we have:

:<math>\eta_Y \circ F(f) = G(f) \circ \eta_X</math>

This equation can conveniently be expressed by the [[commutative diagram]]

[[File:Natural transformation.svg|175px]]

If both ''F'' and ''G'' are [[contravariant functor|contravariant]], the horizontal arrows in this diagram are reversed. If η is a natural transformation from ''F'' to ''G'', we also write {{nobreak|1=η : ''F'' → ''G''}} or {{nobreak|1=η : ''F'' ⇒ ''G''}}. This is also expressed by saying the family of morphisms {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} is '''natural''' in ''X''.

If, for every object ''X'' in ''C'', the morphism η''X'' is an [[isomorphism]] in ''D'', then η is said to be a '''{{visible anchor|natural isomorphism}}''' (or sometimes '''natural equivalence''' or '''isomorphism of functors'''). Two functors ''F'' and ''G'' are called ''naturally isomorphic'' or simply ''isomorphic'' if there exists a natural isomorphism from ''F'' to ''G''.

An '''infranatural transformation''' η from ''F'' to ''G'' is simply a family of morphisms {{nobreak|1=η''X'': ''F''(''X'') → ''G''(''X'')}}. Thus a natural transformation is an infranatural transformation for which {{nobreak|1=η''Y'' ∘ ''F''(''f'') = ''G''(''f'') ∘ η''X''}} for every morphism {{nobreak|1=''f'' : ''X'' → ''Y''}}. The '''naturalizer''' of η, nat(η), is the largest [[subcategory]] of ''C'' containing all the objects of ''C'' on which η restricts to a natural transformation.

==Examples==
===Opposite group===
{{details|Opposite group}}
Statements such as
:"Every group is naturally isomorphic to its [[opposite group]]"
abound in modern mathematics. We will now give the precise meaning of this statement as well as its proof. Consider the category '''Grp''' of all [[group (mathematics)|group]]s with [[group homomorphism]]s as morphisms. If (''G'',*) is a group, we define its opposite group (''G''op,*op) as follows: ''G''op is the same set as ''G'', and the operation *op is defined by {{nobreak|1=''a'' *op ''b'' = ''b'' * ''a''}}. All multiplications in ''G''op are thus "turned around". Forming the [[Opposite category|opposite]] group becomes a (covariant!) functor from '''Grp''' to '''Grp''' if we define {{nobreak|1=''f''op = ''f''}} for any group homomorphism {{nobreak|1=''f'': ''G'' → ''H''}}. Note that ''f''op is indeed a group homomorphism from ''G''op to ''H''op:
:''f''op(''a'' *op ''b'') = ''f''(''b'' * ''a'') = ''f''(''b'') * ''f''(''a'') = ''f''op(''a'') *op ''f''op(''b'').
The content of the above statement is:
:"The identity functor {{nobreak|1=Id'''Grp''' : '''Grp''' → '''Grp'''}} is naturally isomorphic to the opposite functor {{nobreak|1=op : '''Grp''' → '''Grp'''}}."
To prove this, we need to provide isomorphisms {{nobreak|1=η''G'' : ''G'' → ''G''op}} for every group ''G'', such that the above diagram commutes. Set {{nobreak|1=η''G''(''a'') = ''a''−1}}. The formulas {{nobreak|1=(''ab'')−1 = ''b''−1 ''a''−1}} and {{nobreak|1=(''a''−1)−1 = ''a''}} show that η''G'' is a group homomorphism which is its own inverse. To prove the naturality, we start with a group homomorphism {{nobreak|1=''f'' : ''G'' → ''H''}} and show {{nobreak|1=η''H'' ∘ ''f'' = ''f''op ∘ η''G''}}, i.e. {{nobreak|1=(''f''(''a''))−1 = ''f''op(''a''−1)}} for all ''a'' in ''G''. This is true since {{nobreak|1=''f''op = ''f''}} and every group homomorphism has the property {{nobreak|1=(''f''(''a''))−1 = ''f''(''a''−1)}}.

===Double dual of a finite dimensional vector space===
If ''K'' is a [[field (mathematics)|field]], then for every [[vector space]] ''V'' over ''K'' we have a "natural" [[injective]] [[linear map]] {{nobreak|1=''V'' → ''V''**}} from the vector space into its [[double dual]]. These maps are "natural" in the following sense: the double dual operation is a functor, and the maps are the components of a natural transformation from the identity functor to the double dual functor.

===Counterexample: dual of a finite-dimensional vector space===
Every finite-dimensional vector space is isomorphic to its dual space, but this isomorphism relies on an arbitrary choice of isomorphism (for example, via choosing a basis and then taking the isomorphism sending this basis to the corresponding [[dual basis]]). There is in general no natural isomorphism between a finite-dimensional vector space and its dual space.<ref>{{harv|MacLane|Birkhoff|1999|loc=§VI.4}}</ref> However, related categories (with additional structure and restrictions on the maps) do have a natural isomorphism, as described below.

The dual space of a finite-dimensional vector space is again a finite-dimensional vector space of the same dimension, and these are thus isomorphic, since dimension is the only invariant of finite-dimensional vector spaces over a given field. However, in the absence of additional data (such as a basis), there is no given map from a space to its dual, and thus such an isomorphism requires a choice, and is "not natural". On the category of finite-dimensional vector spaces and linear maps, one can define an infranatural isomorphism from vector spaces to their dual by choosing an isomorphism for each space (say, by choosing a basis for every vector space and taking the corresponding isomorphism), but this will not define a natural transformation. Intuitively this is because it required a choice, rigorously because ''any'' such choice of isomorphisms will not commute with ''all'' linear maps; see {{harv|MacLane|Birkhoff|1999|loc=§VI.4}} for detailed discussion.

Starting from finite-dimensional vector spaces (as objects) and the dual functor, one can define a natural isomorphism, but this requires first adding additional structure, then restricting the maps from "all linear maps" to "linear maps that respect this structure". Explicitly, for each vector space, require that it come with the data of an isomorphism to its dual, <math>\eta_V\colon V \to V^*.</math> In other words, take as objects vector spaces with a [[nondegenerate bilinear form]] <math>b_V\colon V \times V \to K.</math> This defines an infranatural isomorphism (isomorphism for each object). One then restricts the maps to only those maps that commute with these isomorphism (restricts to the naturalizer of ''η''), in other words, restrict to the maps that do not change the bilinear form: <math>b(T(v),T(w))=b(v,w).</math> The resulting category, with objects finite-dimensional vector spaces with a nondegenerate bilinear form, and maps linear transforms that respect the bilinear form, by construction has a natural isomorphism from the identity to the dual (each space has an isomorphism to its dual, and the maps in the category are required to commute). Viewed in this light, this construction (add transforms for each object, restrict maps to commute with these) is completely general, and does not depend on any particular properties of vector spaces.

In this category (finite-dimensional vector spaces with a nondegenerate bilinear form, maps linear transforms that respect the bilinear form), the dual of a map between vector spaces can be identified as a [[transpose]]. Often for reasons of geometric interest this is specialized to a subcategory, by requiring that the nondegenerate bilinear forms have additional properties, such as being symmetric ([[orthogonal matrices]]), symmetric and positive definite ([[inner product space]]), symmetric sesquilinear ([[Hermitian space]]s), skew-symmetric and totally isotropic ([[symplectic vector space]]), etc. – in all these categories a vector space is naturally identified with its dual, by the nondegenerate bilinear form.

===Tensor-hom adjunction===
{{see|Tensor-hom adjunction|Adjoint functors}}
Consider the [[category of abelian groups|category '''Ab''' of abelian groups and group homomorphisms]]. For all abelian groups ''X'', ''Y'' and ''Z'' we have a group isomorphism
:{{nobreak|1=Hom(''X'' {{otimes}} ''Y'', ''Z'') → Hom(''X'', Hom(''Y'', ''Z''))}}.
These isomorphisms are "natural" in the sense that they define a natural transformation between the two involved functors {{nobreak|1='''Ab''' × '''Ab'''op × '''Ab'''op → '''Ab'''}}.

This is formally the [[tensor-hom adjunction]], and is an archetypal example of a pair of [[adjoint functors]]. Natural transformations arise frequently in conjunction with adjoint functors, and indeed, adjoint functors are defined by a certain natural isomorphism. Additionally, every pair of adjoint functors comes equipped with two natural transformations (generally not isomorphisms) called the ''unit'' and ''counit''.

== Unnatural isomorphism ==
The notion of a natural transformation is categorical, and states (informally) that a particular map between functors can be done consistently over an entire category. Informally, a particular map (esp. an isomorphism) between individual objects (not entire categories) is referred to as a "natural isomorphism", meaning implicitly that it is actually defined on the entire category, and defines a natural transformation of functors; formalizing this intuition was a motivating factor in the development of category theory. Conversely, a particular map between particular objects may be called an '''unnatural isomorphism''' (or "this isomorphism is not natural") if the map cannot be extended to a natural transformation on the entire category. Given an object ''X,'' a functor ''G'' (taking for simplicity the first functor to be the identity) and an isomorphism <math>\eta\colon X \to G(X),</math> proof of unnaturality is most easily shown by giving an automorphism <math>A\colon X \to X</math> that does not commute with this isomorphism (so <math>\eta \circ A \neq G(A) \circ \eta</math>). More strongly, if one wishes to prove that ''X'' and ''G''(''X'') are not naturally isomorphic, without reference to a particular isomorphism, this requires showing that for ''any'' isomorphism ''η,'' there is some ''A'' with which it does not commute; in some cases a single automorphism ''A'' works for all candidate isomorphisms ''η,'' while in other cases one must show how to construct a different ''A''''η'' for each isomorphism. The maps of the category play a crucial role – any infranatural transform is natural if the only maps are the identity map, for instance.

This is similar (but more categorical) to concepts in group theory or module theory, where a given decomposition of an object into a direct sum is "not natural", or rather "not unique", as automorphisms exist that do not preserve the direct sum decomposition – see [[Structure theorem for finitely generated modules over a principal ideal domain#Uniqueness]] for example.

Some authors distinguish notationally, using ≅ for a natural isomorphism and ≈ for an unnatural isomorphism, reserving = for equality (usually equality of maps).

== Operations with natural transformations ==
If {{nobreak|1=η : ''F'' → ''G''}} and {{nobreak|1=ε : ''G'' → ''H''}} are natural transformations between functors {{nobreak|1=''F'',''G'',''H'' : ''C'' → ''D''}}, then we can compose them to get a natural transformation {{nobreak|1=εη : ''F'' → ''H''}}. This is done componentwise: {{nobreak|1=(εη)''X'' = ε''X''η''X''}}. This "vertical composition" of natural transformation is [[associative]] and has an identity, and allows one to consider the collection of all functors {{nobreak|1=''C'' → ''D''}} itself as a category (see below under [[#Functor categories|Functor categories]]).

Natural transformations also have a "horizontal composition". If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}} and {{nobreak|1=ε : ''J'' → ''K''}} is a natural transformation between functors {{nobreak|1=''J'',''K'' : ''D'' → ''E''}}, then the composition of functors allows a composition of natural transformations {{nobreak|1=ηε : ''JF'' → ''KG''}}. This operation is also associative with identity, and the identity coincides with that for vertical composition. The two operations are related by an identity which exchanges vertical composition with horizontal composition.

If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}}, and {{nobreak|1=''H'' : ''D'' → ''E''}} is another functor, then we can form the natural transformation {{nobreak|1=''H''η : ''HF'' → ''HG''}} by defining

:<math> (H \eta)_X = H \eta_X. </math>

If on the other hand {{nobreak|1=''K'' : ''B'' → ''C''}} is a functor, the natural transformation {{nobreak|1=η''K'' : ''FK'' → ''GK''}} is defined by

:<math> (\eta K)_X = \eta_{K(X)}.\, </math>

==Functor categories==

{{Main|Functor category}}
If ''C'' is any category and ''I'' is a [[small category]], we can form the [[functor category]] ''CI'' having as objects all functors from ''I'' to ''C'' and as morphisms the natural transformations between those functors. This forms a category since for any functor ''F'' there is an identity natural transformation {{nobreak|1=1''F'' : ''F'' → ''F''}} (which assigns to every object ''X'' the identity morphism on ''F''(''X'')) and the composition of two natural transformations (the "vertical composition" above) is again a natural transformation.

The [[isomorphism]]s in ''CI'' are precisely the natural isomorphisms. That is, a natural transformation {{nobreak|1=η : ''F'' → ''G''}} is a natural isomorphism if and only if there exists a natural transformation {{nobreak|1=ε : ''G'' → ''F''}} such that {{nobreak|1=ηε = 1''G''}} and {{nobreak|1=εη = 1''F''}}.

The functor category ''CI'' is especially useful if ''I'' arises from a [[directed graph]]. For instance, if ''I'' is the category of the directed graph {{nobreak|1=• → •}}, then ''CI'' has as objects the morphisms of ''C'', and a morphism between {{nobreak|1=φ : ''U'' → ''V''}} and {{nobreak|1=ψ : ''X'' → ''Y''}} in ''CI'' is a pair of morphisms {{nobreak|1=''f'' : ''U'' → ''X''}} and {{nobreak|1=''g'' : ''V'' → ''Y''}} in ''C'' such that the "square commutes", i.e. {{nobreak|1=ψ ''f'' = ''g'' φ}}.

More generally, one can build the [[2-category]] '''Cat''' whose
* 0-cells (objects) are the small categories,
* 1-cells (arrows) between two objects <math>C</math> and <math>D</math> are the functors from <math>C</math> to <math>D</math>,
* 2-cells between two 1-cells (functors) <math>F:C\to D</math> and <math>G:C\to D</math> are the natural transformations from <math>F</math> to <math>G</math>.
The horizontal and vertical compositions are the compositions between natural transformations described previously. A functor category <math>C^I</math> is then simply a hom-category in this category (smallness issues aside).

==Yoneda lemma==

{{Main|Yoneda lemma}}
If ''X'' is an object of a [[locally small category]] ''C'', then the assignment {{nobreak|1=''Y'' {{mapsto}} Hom''C''(''X'', ''Y'')}} defines a covariant functor {{nobreak|1=''F''''X'' : ''C'' → '''Set'''}}. This functor is called ''[[representable functor|representable]]'' (more generally, a representable functor is any functor naturally isomorphic to this functor for an appropriate choice of ''X''). The natural transformations from a representable functor to an arbitrary functor {{nobreak|1=''F'' : ''C'' → '''Set'''}} are completely known and easy to describe; this is the content of the [[Yoneda lemma]].

== Historical notes ==
{{Unreferenced section|date=October 2008}}
[[Saunders Mac Lane]], one of the founders of category theory, is said to have remarked, "I didn't invent categories to study functors; I invented them to study natural transformations."<ref>{{harv|Mac Lane|1998|§I.4}}</ref> Just as the study of [[group (mathematics)|groups]] is not complete without a study of [[group homomorphism|homomorphisms]], so the study of categories is not complete without the study of [[functor]]s. The reason for Mac Lane's comment is that the study of functors is itself not complete without the study of natural transformations.

The context of Mac Lane's remark was the axiomatic theory of [[homology (mathematics)|homology]]. Different ways of constructing homology could be shown to coincide: for example in the case of a [[simplicial complex]] the groups defined directly would be isomorphic to those of the singular theory. What cannot easily be expressed without the language of natural transformations is how homology groups are compatible with morphisms between objects, and how two equivalent homology theories not only have the same homology groups, but also the same morphisms between those groups.

==Symbols used==
* {{unichar|2297|CIRCLED TIMES|html=}}

== See also ==
* [[Extranatural transformation]]

== References ==
{{Portal|Category theory}}
{{reflist}}
{{refbegin}}
*{{citation| first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = Graduate Texts in Mathematics '''5''' | edition = 2nd | publisher = Springer-Verlag | isbn = 0-387-98403-8}}
* {{citation|first1=Saunders|last1=MacLane|authorlink1=Saunders MacLane|first2=Garrett|last2=Birkhoff|authorlink2=Garrett Birkhoff|title=Algebra|edition=3rd|publisher=AMS Chelsea Publishing|year=1999|isbn=0-8218-1646-2}}.
{{refend}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [[André Joyal]], [http://ncatlab.org/nlab CatLab], a wiki project dedicated to the exposition of categorical mathematics
* {{cite web | first = Chris | last = Hillman | title = A Categorical Primer | id = {{citeseerx|10.1.1.24.3264}} | postscript = : }} formal introduction to category theory.
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
* [[Stanford Encyclopedia of Philosophy]]: "[http://plato.stanford.edu/entries/category-theory/ Category Theory]" -- by Jean-Pierre Marquis. Extensive bibliography.
* [http://www.mta.ca/~cat-dist/ List of academic conferences on category theory]
* Baez, John, 1996,"[http://math.ucr.edu/home/baez/week73.html The Tale of ''n''-categories.]" An informal introduction to higher order categories.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Functors]]

[[es:Transformación natural]]
[[fr:Transformation naturelle]]
[[it:Trasformazione naturale]]
[[nl:Natuurlijke transformatie]]
[[ja:自然変換]]
[[pl:Transformacja naturalna]]
[[ru:Естественное преобразование]]
[[sv:Naturlig transformation]]

Natural transformation

2012-11-11T23:43:47Z

Magmalex: /* Historical notes */

{{About|natural transformations in category theory|the natural competence of bacteria to take up foreign DNA|Transformation (genetics)}}
{{other uses|Transformation (mathematics) (disambiguation)}}
In [[category theory]], a branch of [[mathematics]], a '''natural transformation''' provides a way of transforming one [[functor]] into another while respecting the internal structure (i.e. the composition of [[morphism]]s) of the categories involved. Hence, a natural transformation can be considered to be a "morphism of functors". Indeed this intuition can be formalized to define so-called [[functor category|functor categories]]. Natural transformations are, after categories and functors, one of the most basic notions of [[category theory]] and consequently appear in the majority of its applications.

==Definition==
If ''F'' and ''G'' are [[functor]]s between the categories ''C'' and ''D'', then a '''natural transformation''' η from ''F'' to ''G'' associates to every object ''X'' in ''C'' a [[morphism]] {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} between objects of ''D'', called the '''component''' of η at ''X'', such that for every morphism {{nobreak|1=''f'' : ''X'' → ''Y'' in ''C''}} we have:

:<math>\eta_Y \circ F(f) = G(f) \circ \eta_X</math>

This equation can conveniently be expressed by the [[commutative diagram]]

[[File:Natural transformation.svg|175px]]

If both ''F'' and ''G'' are [[contravariant functor|contravariant]], the horizontal arrows in this diagram are reversed. If η is a natural transformation from ''F'' to ''G'', we also write {{nobreak|1=η : ''F'' → ''G''}} or {{nobreak|1=η : ''F'' ⇒ ''G''}}. This is also expressed by saying the family of morphisms {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} is '''natural''' in ''X''.

If, for every object ''X'' in ''C'', the morphism η''X'' is an [[isomorphism]] in ''D'', then η is said to be a '''{{visible anchor|natural isomorphism}}''' (or sometimes '''natural equivalence''' or '''isomorphism of functors'''). Two functors ''F'' and ''G'' are called ''naturally isomorphic'' or simply ''isomorphic'' if there exists a natural isomorphism from ''F'' to ''G''.

An '''infranatural transformation''' η from ''F'' to ''G'' is simply a family of morphisms {{nobreak|1=η''X'': ''F''(''X'') → ''G''(''X'')}}. Thus a natural transformation is an infranatural transformation for which {{nobreak|1=η''Y'' ∘ ''F''(''f'') = ''G''(''f'') ∘ η''X''}} for every morphism {{nobreak|1=''f'' : ''X'' → ''Y''}}. The '''naturalizer''' of η, nat(η), is the largest [[subcategory]] of ''C'' containing all the objects of ''C'' on which η restricts to a natural transformation.

==Examples==
===Opposite group===
{{details|Opposite group}}
Statements such as
:"Every group is naturally isomorphic to its [[opposite group]]"
abound in modern mathematics. We will now give the precise meaning of this statement as well as its proof. Consider the category '''Grp''' of all [[group (mathematics)|group]]s with [[group homomorphism]]s as morphisms. If (''G'',*) is a group, we define its opposite group (''G''op,*op) as follows: ''G''op is the same set as ''G'', and the operation *op is defined by {{nobreak|1=''a'' *op ''b'' = ''b'' * ''a''}}. All multiplications in ''G''op are thus "turned around". Forming the [[Opposite category|opposite]] group becomes a (covariant!) functor from '''Grp''' to '''Grp''' if we define {{nobreak|1=''f''op = ''f''}} for any group homomorphism {{nobreak|1=''f'': ''G'' → ''H''}}. Note that ''f''op is indeed a group homomorphism from ''G''op to ''H''op:
:''f''op(''a'' *op ''b'') = ''f''(''b'' * ''a'') = ''f''(''b'') * ''f''(''a'') = ''f''op(''a'') *op ''f''op(''b'').
The content of the above statement is:
:"The identity functor {{nobreak|1=Id'''Grp''' : '''Grp''' → '''Grp'''}} is naturally isomorphic to the opposite functor {{nobreak|1=op : '''Grp''' → '''Grp'''}}."
To prove this, we need to provide isomorphisms {{nobreak|1=η''G'' : ''G'' → ''G''op}} for every group ''G'', such that the above diagram commutes. Set {{nobreak|1=η''G''(''a'') = ''a''−1}}. The formulas {{nobreak|1=(''ab'')−1 = ''b''−1 ''a''−1}} and {{nobreak|1=(''a''−1)−1 = ''a''}} show that η''G'' is a group homomorphism which is its own inverse. To prove the naturality, we start with a group homomorphism {{nobreak|1=''f'' : ''G'' → ''H''}} and show {{nobreak|1=η''H'' ∘ ''f'' = ''f''op ∘ η''G''}}, i.e. {{nobreak|1=(''f''(''a''))−1 = ''f''op(''a''−1)}} for all ''a'' in ''G''. This is true since {{nobreak|1=''f''op = ''f''}} and every group homomorphism has the property {{nobreak|1=(''f''(''a''))−1 = ''f''(''a''−1)}}.

===Double dual of a finite dimensional vector space===
If ''K'' is a [[field (mathematics)|field]], then for every [[vector space]] ''V'' over ''K'' we have a "natural" [[injective]] [[linear map]] {{nobreak|1=''V'' → ''V''**}} from the vector space into its [[double dual]]. These maps are "natural" in the following sense: the double dual operation is a functor, and the maps are the components of a natural transformation from the identity functor to the double dual functor.

===Counterexample: dual of a finite-dimensional vector space===
Every finite-dimensional vector space is isomorphic to its dual space, but this isomorphism relies on an arbitrary choice of isomorphism (for example, via choosing a basis and then taking the isomorphism sending this basis to the corresponding [[dual basis]]). There is in general no natural isomorphism between a finite-dimensional vector space and its dual space.<ref>{{harv|MacLane|Birkhoff|1999|loc=§VI.4}}</ref> However, related categories (with additional structure and restrictions on the maps) do have a natural isomorphism, as described below.

The dual space of a finite-dimensional vector space is again a finite-dimensional vector space of the same dimension, and these are thus isomorphic, since dimension is the only invariant of finite-dimensional vector spaces over a given field. However, in the absence of additional data (such as a basis), there is no given map from a space to its dual, and thus such an isomorphism requires a choice, and is "not natural". On the category of finite-dimensional vector spaces and linear maps, one can define an infranatural isomorphism from vector spaces to their dual by choosing an isomorphism for each space (say, by choosing a basis for every vector space and taking the corresponding isomorphism), but this will not define a natural transformation. Intuitively this is because it required a choice, rigorously because ''any'' such choice of isomorphisms will not commute with ''all'' linear maps; see {{harv|MacLane|Birkhoff|1999|loc=§VI.4}} for detailed discussion.

Starting from finite-dimensional vector spaces (as objects) and the dual functor, one can define a natural isomorphism, but this requires first adding additional structure, then restricting the maps from "all linear maps" to "linear maps that respect this structure". Explicitly, for each vector space, require that it come with the data of an isomorphism to its dual, <math>\eta_V\colon V \to V^*.</math> In other words, take as objects vector spaces with a [[nondegenerate bilinear form]] <math>b_V\colon V \times V \to K.</math> This defines an infranatural isomorphism (isomorphism for each object). One then restricts the maps to only those maps that commute with these isomorphism (restricts to the naturalizer of ''η''), in other words, restrict to the maps that do not change the bilinear form: <math>b(T(v),T(w))=b(v,w).</math> The resulting category, with objects finite-dimensional vector spaces with a nondegenerate bilinear form, and maps linear transforms that respect the bilinear form, by construction has a natural isomorphism from the identity to the dual (each space has an isomorphism to its dual, and the maps in the category are required to commute). Viewed in this light, this construction (add transforms for each object, restrict maps to commute with these) is completely general, and does not depend on any particular properties of vector spaces.

In this category (finite-dimensional vector spaces with a nondegenerate bilinear form, maps linear transforms that respect the bilinear form), the dual of a map between vector spaces can be identified as a [[transpose]]. Often for reasons of geometric interest this is specialized to a subcategory, by requiring that the nondegenerate bilinear forms have additional properties, such as being symmetric ([[orthogonal matrices]]), symmetric and positive definite ([[inner product space]]), symmetric sesquilinear ([[Hermitian space]]s), skew-symmetric and totally isotropic ([[symplectic vector space]]), etc. – in all these categories a vector space is naturally identified with its dual, by the nondegenerate bilinear form.

===Tensor-hom adjunction===
{{see|Tensor-hom adjunction|Adjoint functors}}
Consider the [[category of abelian groups|category '''Ab''' of abelian groups and group homomorphisms]]. For all abelian groups ''X'', ''Y'' and ''Z'' we have a group isomorphism
:{{nobreak|1=Hom(''X'' {{otimes}} ''Y'', ''Z'') → Hom(''X'', Hom(''Y'', ''Z''))}}.
These isomorphisms are "natural" in the sense that they define a natural transformation between the two involved functors {{nobreak|1='''Ab''' × '''Ab'''op × '''Ab'''op → '''Ab'''}}.

This is formally the [[tensor-hom adjunction]], and is an archetypal example of a pair of [[adjoint functors]]. Natural transformations arise frequently in conjunction with adjoint functors, and indeed, adjoint functors are defined by a certain natural isomorphism. Additionally, every pair of adjoint functors comes equipped with two natural transformations (generally not isomorphisms) called the ''unit'' and ''counit''.

== Unnatural isomorphism ==
The notion of a natural transformation is categorical, and states (informally) that a particular map between functors can be done consistently over an entire category. Informally, a particular map (esp. an isomorphism) between individual objects (not entire categories) is referred to as a "natural isomorphism", meaning implicitly that it is actually defined on the entire category, and defines a natural transformation of functors; formalizing this intuition was a motivating factor in the development of category theory. Conversely, a particular map between particular objects may be called an '''unnatural isomorphism''' (or "this isomorphism is not natural") if the map cannot be extended to a natural transformation on the entire category. Given an object ''X,'' a functor ''G'' (taking for simplicity the first functor to be the identity) and an isomorphism <math>\eta\colon X \to G(X),</math> proof of unnaturality is most easily shown by giving an automorphism <math>A\colon X \to X</math> that does not commute with this isomorphism (so <math>\eta \circ A \neq G(A) \circ \eta</math>). More strongly, if one wishes to prove that ''X'' and ''G''(''X'') are not naturally isomorphic, without reference to a particular isomorphism, this requires showing that for ''any'' isomorphism ''η,'' there is some ''A'' with which it does not commute; in some cases a single automorphism ''A'' works for all candidate isomorphisms ''η,'' while in other cases one must show how to construct a different ''A''''η'' for each isomorphism. The maps of the category play a crucial role – any infranatural transform is natural if the only maps are the identity map, for instance.

This is similar (but more categorical) to concepts in group theory or module theory, where a given decomposition of an object into a direct sum is "not natural", or rather "not unique", as automorphisms exist that do not preserve the direct sum decomposition – see [[Structure theorem for finitely generated modules over a principal ideal domain#Uniqueness]] for example.

Some authors distinguish notationally, using ≅ for a natural isomorphism and ≈ for an unnatural isomorphism, reserving = for equality (usually equality of maps).

== Operations with natural transformations ==
If {{nobreak|1=η : ''F'' → ''G''}} and {{nobreak|1=ε : ''G'' → ''H''}} are natural transformations between functors {{nobreak|1=''F'',''G'',''H'' : ''C'' → ''D''}}, then we can compose them to get a natural transformation {{nobreak|1=εη : ''F'' → ''H''}}. This is done componentwise: {{nobreak|1=(εη)''X'' = ε''X''η''X''}}. This "vertical composition" of natural transformation is [[associative]] and has an identity, and allows one to consider the collection of all functors {{nobreak|1=''C'' → ''D''}} itself as a category (see below under [[#Functor categories|Functor categories]]).

Natural transformations also have a "horizontal composition". If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}} and {{nobreak|1=ε : ''J'' → ''K''}} is a natural transformation between functors {{nobreak|1=''J'',''K'' : ''D'' → ''E''}}, then the composition of functors allows a composition of natural transformations {{nobreak|1=ηε : ''JF'' → ''KG''}}. This operation is also associative with identity, and the identity coincides with that for vertical composition. The two operations are related by an identity which exchanges vertical composition with horizontal composition.

If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}}, and {{nobreak|1=''H'' : ''D'' → ''E''}} is another functor, then we can form the natural transformation {{nobreak|1=''H''η : ''HF'' → ''HG''}} by defining

:<math> (H \eta)_X = H \eta_X. </math>

If on the other hand {{nobreak|1=''K'' : ''B'' → ''C''}} is a functor, the natural transformation {{nobreak|1=η''K'' : ''FK'' → ''GK''}} is defined by

:<math> (\eta K)_X = \eta_{K(X)}.\, </math>

==Functor categories==

{{Main|Functor category}}
If ''C'' is any category and ''I'' is a [[small category]], we can form the [[functor category]] ''CI'' having as objects all functors from ''I'' to ''C'' and as morphisms the natural transformations between those functors. This forms a category since for any functor ''F'' there is an identity natural transformation {{nobreak|1=1''F'' : ''F'' → ''F''}} (which assigns to every object ''X'' the identity morphism on ''F''(''X'')) and the composition of two natural transformations (the "vertical composition" above) is again a natural transformation.

The [[isomorphism]]s in ''CI'' are precisely the natural isomorphisms. That is, a natural transformation {{nobreak|1=η : ''F'' → ''G''}} is a natural isomorphism if and only if there exists a natural transformation {{nobreak|1=ε : ''G'' → ''F''}} such that {{nobreak|1=ηε = 1''G''}} and {{nobreak|1=εη = 1''F''}}.

The functor category ''CI'' is especially useful if ''I'' arises from a [[directed graph]]. For instance, if ''I'' is the category of the directed graph {{nobreak|1=• → •}}, then ''CI'' has as objects the morphisms of ''C'', and a morphism between {{nobreak|1=φ : ''U'' → ''V''}} and {{nobreak|1=ψ : ''X'' → ''Y''}} in ''CI'' is a pair of morphisms {{nobreak|1=''f'' : ''U'' → ''X''}} and {{nobreak|1=''g'' : ''V'' → ''Y''}} in ''C'' such that the "square commutes", i.e. {{nobreak|1=ψ ''f'' = ''g'' φ}}.

More generally, one can build the [[2-category]] '''Cat''' whose
* 0-cells (objects) are the small categories,
* 1-cells (arrows) between two objects <math>C</math> and <math>D</math> are the functors from <math>C</math> to <math>D</math>,
* 2-cells between two 1-cells (functors) <math>F:C\to D</math> and <math>G:C\to D</math> are the natural transformations from <math>F</math> to <math>G</math>.
The horizontal and vertical compositions are the compositions between natural transformations described previously. A functor category <math>C^I</math> is then simply a hom-category in this category (smallness issues aside).

==Yoneda lemma==

{{Main|Yoneda lemma}}
If ''X'' is an object of a [[locally small category]] ''C'', then the assignment {{nobreak|1=''Y'' {{mapsto}} Hom''C''(''X'', ''Y'')}} defines a covariant functor {{nobreak|1=''F''''X'' : ''C'' → '''Set'''}}. This functor is called ''[[representable functor|representable]]'' (more generally, a representable functor is any functor naturally isomorphic to this functor for an appropriate choice of ''X''). The natural transformations from a representable functor to an arbitrary functor {{nobreak|1=''F'' : ''C'' → '''Set'''}} are completely known and easy to describe; this is the content of the [[Yoneda lemma]].

== Historical notes ==
{{Unreferenced section|date=October 2008}}
[[Saunders Mac Lane]], one of the founders of category theory, is said to have remarked, "I didn't invent categories to study functors; I invented them to study natural transformations."<ref>{{harv|MacLane|1998}}</ref> Just as the study of [[group (mathematics)|groups]] is not complete without a study of [[group homomorphism|homomorphisms]], so the study of categories is not complete without the study of [[functor]]s. The reason for Mac Lane's comment is that the study of functors is itself not complete without the study of natural transformations.

The context of Mac Lane's remark was the axiomatic theory of [[homology (mathematics)|homology]]. Different ways of constructing homology could be shown to coincide: for example in the case of a [[simplicial complex]] the groups defined directly would be isomorphic to those of the singular theory. What cannot easily be expressed without the language of natural transformations is how homology groups are compatible with morphisms between objects, and how two equivalent homology theories not only have the same homology groups, but also the same morphisms between those groups.

==Symbols used==
* {{unichar|2297|CIRCLED TIMES|html=}}

== See also ==
* [[Extranatural transformation]]

== References ==
{{Portal|Category theory}}
{{reflist}}
{{refbegin}}
*{{citation| first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = Graduate Texts in Mathematics '''5''' | edition = 2nd | publisher = Springer-Verlag | isbn = 0-387-98403-8}}
* {{citation|first1=Saunders|last1=MacLane|authorlink1=Saunders MacLane|first2=Garrett|last2=Birkhoff|authorlink2=Garrett Birkhoff|title=Algebra|edition=3rd|publisher=AMS Chelsea Publishing|year=1999|isbn=0-8218-1646-2}}.
{{refend}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [[André Joyal]], [http://ncatlab.org/nlab CatLab], a wiki project dedicated to the exposition of categorical mathematics
* {{cite web | first = Chris | last = Hillman | title = A Categorical Primer | id = {{citeseerx|10.1.1.24.3264}} | postscript = : }} formal introduction to category theory.
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
* [[Stanford Encyclopedia of Philosophy]]: "[http://plato.stanford.edu/entries/category-theory/ Category Theory]" -- by Jean-Pierre Marquis. Extensive bibliography.
* [http://www.mta.ca/~cat-dist/ List of academic conferences on category theory]
* Baez, John, 1996,"[http://math.ucr.edu/home/baez/week73.html The Tale of ''n''-categories.]" An informal introduction to higher order categories.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Functors]]

[[es:Transformación natural]]
[[fr:Transformation naturelle]]
[[it:Trasformazione naturale]]
[[nl:Natuurlijke transformatie]]
[[ja:自然変換]]
[[pl:Transformacja naturalna]]
[[ru:Естественное преобразование]]
[[sv:Naturlig transformation]]

Natural transformation

2012-11-11T23:40:09Z

Magmalex: /* References */

{{About|natural transformations in category theory|the natural competence of bacteria to take up foreign DNA|Transformation (genetics)}}
{{other uses|Transformation (mathematics) (disambiguation)}}
In [[category theory]], a branch of [[mathematics]], a '''natural transformation''' provides a way of transforming one [[functor]] into another while respecting the internal structure (i.e. the composition of [[morphism]]s) of the categories involved. Hence, a natural transformation can be considered to be a "morphism of functors". Indeed this intuition can be formalized to define so-called [[functor category|functor categories]]. Natural transformations are, after categories and functors, one of the most basic notions of [[category theory]] and consequently appear in the majority of its applications.

==Definition==
If ''F'' and ''G'' are [[functor]]s between the categories ''C'' and ''D'', then a '''natural transformation''' η from ''F'' to ''G'' associates to every object ''X'' in ''C'' a [[morphism]] {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} between objects of ''D'', called the '''component''' of η at ''X'', such that for every morphism {{nobreak|1=''f'' : ''X'' → ''Y'' in ''C''}} we have:

:<math>\eta_Y \circ F(f) = G(f) \circ \eta_X</math>

This equation can conveniently be expressed by the [[commutative diagram]]

[[File:Natural transformation.svg|175px]]

If both ''F'' and ''G'' are [[contravariant functor|contravariant]], the horizontal arrows in this diagram are reversed. If η is a natural transformation from ''F'' to ''G'', we also write {{nobreak|1=η : ''F'' → ''G''}} or {{nobreak|1=η : ''F'' ⇒ ''G''}}. This is also expressed by saying the family of morphisms {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} is '''natural''' in ''X''.

If, for every object ''X'' in ''C'', the morphism η''X'' is an [[isomorphism]] in ''D'', then η is said to be a '''{{visible anchor|natural isomorphism}}''' (or sometimes '''natural equivalence''' or '''isomorphism of functors'''). Two functors ''F'' and ''G'' are called ''naturally isomorphic'' or simply ''isomorphic'' if there exists a natural isomorphism from ''F'' to ''G''.

An '''infranatural transformation''' η from ''F'' to ''G'' is simply a family of morphisms {{nobreak|1=η''X'': ''F''(''X'') → ''G''(''X'')}}. Thus a natural transformation is an infranatural transformation for which {{nobreak|1=η''Y'' ∘ ''F''(''f'') = ''G''(''f'') ∘ η''X''}} for every morphism {{nobreak|1=''f'' : ''X'' → ''Y''}}. The '''naturalizer''' of η, nat(η), is the largest [[subcategory]] of ''C'' containing all the objects of ''C'' on which η restricts to a natural transformation.

==Examples==
===Opposite group===
{{details|Opposite group}}
Statements such as
:"Every group is naturally isomorphic to its [[opposite group]]"
abound in modern mathematics. We will now give the precise meaning of this statement as well as its proof. Consider the category '''Grp''' of all [[group (mathematics)|group]]s with [[group homomorphism]]s as morphisms. If (''G'',*) is a group, we define its opposite group (''G''op,*op) as follows: ''G''op is the same set as ''G'', and the operation *op is defined by {{nobreak|1=''a'' *op ''b'' = ''b'' * ''a''}}. All multiplications in ''G''op are thus "turned around". Forming the [[Opposite category|opposite]] group becomes a (covariant!) functor from '''Grp''' to '''Grp''' if we define {{nobreak|1=''f''op = ''f''}} for any group homomorphism {{nobreak|1=''f'': ''G'' → ''H''}}. Note that ''f''op is indeed a group homomorphism from ''G''op to ''H''op:
:''f''op(''a'' *op ''b'') = ''f''(''b'' * ''a'') = ''f''(''b'') * ''f''(''a'') = ''f''op(''a'') *op ''f''op(''b'').
The content of the above statement is:
:"The identity functor {{nobreak|1=Id'''Grp''' : '''Grp''' → '''Grp'''}} is naturally isomorphic to the opposite functor {{nobreak|1=op : '''Grp''' → '''Grp'''}}."
To prove this, we need to provide isomorphisms {{nobreak|1=η''G'' : ''G'' → ''G''op}} for every group ''G'', such that the above diagram commutes. Set {{nobreak|1=η''G''(''a'') = ''a''−1}}. The formulas {{nobreak|1=(''ab'')−1 = ''b''−1 ''a''−1}} and {{nobreak|1=(''a''−1)−1 = ''a''}} show that η''G'' is a group homomorphism which is its own inverse. To prove the naturality, we start with a group homomorphism {{nobreak|1=''f'' : ''G'' → ''H''}} and show {{nobreak|1=η''H'' ∘ ''f'' = ''f''op ∘ η''G''}}, i.e. {{nobreak|1=(''f''(''a''))−1 = ''f''op(''a''−1)}} for all ''a'' in ''G''. This is true since {{nobreak|1=''f''op = ''f''}} and every group homomorphism has the property {{nobreak|1=(''f''(''a''))−1 = ''f''(''a''−1)}}.

===Double dual of a finite dimensional vector space===
If ''K'' is a [[field (mathematics)|field]], then for every [[vector space]] ''V'' over ''K'' we have a "natural" [[injective]] [[linear map]] {{nobreak|1=''V'' → ''V''**}} from the vector space into its [[double dual]]. These maps are "natural" in the following sense: the double dual operation is a functor, and the maps are the components of a natural transformation from the identity functor to the double dual functor.

===Counterexample: dual of a finite-dimensional vector space===
Every finite-dimensional vector space is isomorphic to its dual space, but this isomorphism relies on an arbitrary choice of isomorphism (for example, via choosing a basis and then taking the isomorphism sending this basis to the corresponding [[dual basis]]). There is in general no natural isomorphism between a finite-dimensional vector space and its dual space.<ref>{{harv|MacLane|Birkhoff|1999|loc=§VI.4}}</ref> However, related categories (with additional structure and restrictions on the maps) do have a natural isomorphism, as described below.

The dual space of a finite-dimensional vector space is again a finite-dimensional vector space of the same dimension, and these are thus isomorphic, since dimension is the only invariant of finite-dimensional vector spaces over a given field. However, in the absence of additional data (such as a basis), there is no given map from a space to its dual, and thus such an isomorphism requires a choice, and is "not natural". On the category of finite-dimensional vector spaces and linear maps, one can define an infranatural isomorphism from vector spaces to their dual by choosing an isomorphism for each space (say, by choosing a basis for every vector space and taking the corresponding isomorphism), but this will not define a natural transformation. Intuitively this is because it required a choice, rigorously because ''any'' such choice of isomorphisms will not commute with ''all'' linear maps; see {{harv|MacLane|Birkhoff|1999|loc=§VI.4}} for detailed discussion.

Starting from finite-dimensional vector spaces (as objects) and the dual functor, one can define a natural isomorphism, but this requires first adding additional structure, then restricting the maps from "all linear maps" to "linear maps that respect this structure". Explicitly, for each vector space, require that it come with the data of an isomorphism to its dual, <math>\eta_V\colon V \to V^*.</math> In other words, take as objects vector spaces with a [[nondegenerate bilinear form]] <math>b_V\colon V \times V \to K.</math> This defines an infranatural isomorphism (isomorphism for each object). One then restricts the maps to only those maps that commute with these isomorphism (restricts to the naturalizer of ''η''), in other words, restrict to the maps that do not change the bilinear form: <math>b(T(v),T(w))=b(v,w).</math> The resulting category, with objects finite-dimensional vector spaces with a nondegenerate bilinear form, and maps linear transforms that respect the bilinear form, by construction has a natural isomorphism from the identity to the dual (each space has an isomorphism to its dual, and the maps in the category are required to commute). Viewed in this light, this construction (add transforms for each object, restrict maps to commute with these) is completely general, and does not depend on any particular properties of vector spaces.

In this category (finite-dimensional vector spaces with a nondegenerate bilinear form, maps linear transforms that respect the bilinear form), the dual of a map between vector spaces can be identified as a [[transpose]]. Often for reasons of geometric interest this is specialized to a subcategory, by requiring that the nondegenerate bilinear forms have additional properties, such as being symmetric ([[orthogonal matrices]]), symmetric and positive definite ([[inner product space]]), symmetric sesquilinear ([[Hermitian space]]s), skew-symmetric and totally isotropic ([[symplectic vector space]]), etc. – in all these categories a vector space is naturally identified with its dual, by the nondegenerate bilinear form.

===Tensor-hom adjunction===
{{see|Tensor-hom adjunction|Adjoint functors}}
Consider the [[category of abelian groups|category '''Ab''' of abelian groups and group homomorphisms]]. For all abelian groups ''X'', ''Y'' and ''Z'' we have a group isomorphism
:{{nobreak|1=Hom(''X'' {{otimes}} ''Y'', ''Z'') → Hom(''X'', Hom(''Y'', ''Z''))}}.
These isomorphisms are "natural" in the sense that they define a natural transformation between the two involved functors {{nobreak|1='''Ab''' × '''Ab'''op × '''Ab'''op → '''Ab'''}}.

This is formally the [[tensor-hom adjunction]], and is an archetypal example of a pair of [[adjoint functors]]. Natural transformations arise frequently in conjunction with adjoint functors, and indeed, adjoint functors are defined by a certain natural isomorphism. Additionally, every pair of adjoint functors comes equipped with two natural transformations (generally not isomorphisms) called the ''unit'' and ''counit''.

== Unnatural isomorphism ==
The notion of a natural transformation is categorical, and states (informally) that a particular map between functors can be done consistently over an entire category. Informally, a particular map (esp. an isomorphism) between individual objects (not entire categories) is referred to as a "natural isomorphism", meaning implicitly that it is actually defined on the entire category, and defines a natural transformation of functors; formalizing this intuition was a motivating factor in the development of category theory. Conversely, a particular map between particular objects may be called an '''unnatural isomorphism''' (or "this isomorphism is not natural") if the map cannot be extended to a natural transformation on the entire category. Given an object ''X,'' a functor ''G'' (taking for simplicity the first functor to be the identity) and an isomorphism <math>\eta\colon X \to G(X),</math> proof of unnaturality is most easily shown by giving an automorphism <math>A\colon X \to X</math> that does not commute with this isomorphism (so <math>\eta \circ A \neq G(A) \circ \eta</math>). More strongly, if one wishes to prove that ''X'' and ''G''(''X'') are not naturally isomorphic, without reference to a particular isomorphism, this requires showing that for ''any'' isomorphism ''η,'' there is some ''A'' with which it does not commute; in some cases a single automorphism ''A'' works for all candidate isomorphisms ''η,'' while in other cases one must show how to construct a different ''A''''η'' for each isomorphism. The maps of the category play a crucial role – any infranatural transform is natural if the only maps are the identity map, for instance.

This is similar (but more categorical) to concepts in group theory or module theory, where a given decomposition of an object into a direct sum is "not natural", or rather "not unique", as automorphisms exist that do not preserve the direct sum decomposition – see [[Structure theorem for finitely generated modules over a principal ideal domain#Uniqueness]] for example.

Some authors distinguish notationally, using ≅ for a natural isomorphism and ≈ for an unnatural isomorphism, reserving = for equality (usually equality of maps).

== Operations with natural transformations ==
If {{nobreak|1=η : ''F'' → ''G''}} and {{nobreak|1=ε : ''G'' → ''H''}} are natural transformations between functors {{nobreak|1=''F'',''G'',''H'' : ''C'' → ''D''}}, then we can compose them to get a natural transformation {{nobreak|1=εη : ''F'' → ''H''}}. This is done componentwise: {{nobreak|1=(εη)''X'' = ε''X''η''X''}}. This "vertical composition" of natural transformation is [[associative]] and has an identity, and allows one to consider the collection of all functors {{nobreak|1=''C'' → ''D''}} itself as a category (see below under [[#Functor categories|Functor categories]]).

Natural transformations also have a "horizontal composition". If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}} and {{nobreak|1=ε : ''J'' → ''K''}} is a natural transformation between functors {{nobreak|1=''J'',''K'' : ''D'' → ''E''}}, then the composition of functors allows a composition of natural transformations {{nobreak|1=ηε : ''JF'' → ''KG''}}. This operation is also associative with identity, and the identity coincides with that for vertical composition. The two operations are related by an identity which exchanges vertical composition with horizontal composition.

If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}}, and {{nobreak|1=''H'' : ''D'' → ''E''}} is another functor, then we can form the natural transformation {{nobreak|1=''H''η : ''HF'' → ''HG''}} by defining

:<math> (H \eta)_X = H \eta_X. </math>

If on the other hand {{nobreak|1=''K'' : ''B'' → ''C''}} is a functor, the natural transformation {{nobreak|1=η''K'' : ''FK'' → ''GK''}} is defined by

:<math> (\eta K)_X = \eta_{K(X)}.\, </math>

==Functor categories==

{{Main|Functor category}}
If ''C'' is any category and ''I'' is a [[small category]], we can form the [[functor category]] ''CI'' having as objects all functors from ''I'' to ''C'' and as morphisms the natural transformations between those functors. This forms a category since for any functor ''F'' there is an identity natural transformation {{nobreak|1=1''F'' : ''F'' → ''F''}} (which assigns to every object ''X'' the identity morphism on ''F''(''X'')) and the composition of two natural transformations (the "vertical composition" above) is again a natural transformation.

The [[isomorphism]]s in ''CI'' are precisely the natural isomorphisms. That is, a natural transformation {{nobreak|1=η : ''F'' → ''G''}} is a natural isomorphism if and only if there exists a natural transformation {{nobreak|1=ε : ''G'' → ''F''}} such that {{nobreak|1=ηε = 1''G''}} and {{nobreak|1=εη = 1''F''}}.

The functor category ''CI'' is especially useful if ''I'' arises from a [[directed graph]]. For instance, if ''I'' is the category of the directed graph {{nobreak|1=• → •}}, then ''CI'' has as objects the morphisms of ''C'', and a morphism between {{nobreak|1=φ : ''U'' → ''V''}} and {{nobreak|1=ψ : ''X'' → ''Y''}} in ''CI'' is a pair of morphisms {{nobreak|1=''f'' : ''U'' → ''X''}} and {{nobreak|1=''g'' : ''V'' → ''Y''}} in ''C'' such that the "square commutes", i.e. {{nobreak|1=ψ ''f'' = ''g'' φ}}.

More generally, one can build the [[2-category]] '''Cat''' whose
* 0-cells (objects) are the small categories,
* 1-cells (arrows) between two objects <math>C</math> and <math>D</math> are the functors from <math>C</math> to <math>D</math>,
* 2-cells between two 1-cells (functors) <math>F:C\to D</math> and <math>G:C\to D</math> are the natural transformations from <math>F</math> to <math>G</math>.
The horizontal and vertical compositions are the compositions between natural transformations described previously. A functor category <math>C^I</math> is then simply a hom-category in this category (smallness issues aside).

==Yoneda lemma==

{{Main|Yoneda lemma}}
If ''X'' is an object of a [[locally small category]] ''C'', then the assignment {{nobreak|1=''Y'' {{mapsto}} Hom''C''(''X'', ''Y'')}} defines a covariant functor {{nobreak|1=''F''''X'' : ''C'' → '''Set'''}}. This functor is called ''[[representable functor|representable]]'' (more generally, a representable functor is any functor naturally isomorphic to this functor for an appropriate choice of ''X''). The natural transformations from a representable functor to an arbitrary functor {{nobreak|1=''F'' : ''C'' → '''Set'''}} are completely known and easy to describe; this is the content of the [[Yoneda lemma]].

== Historical notes ==
{{Unreferenced section|date=October 2008}}
[[Saunders Mac Lane]], one of the founders of category theory, is said to have remarked, "I didn't invent categories to study functors; I invented them to study natural transformations."<ref>{{harv|MacLane}}</ref> Just as the study of [[group (mathematics)|groups]] is not complete without a study of [[group homomorphism|homomorphisms]], so the study of categories is not complete without the study of [[functor]]s. The reason for Mac Lane's comment is that the study of functors is itself not complete without the study of natural transformations.

The context of Mac Lane's remark was the axiomatic theory of [[homology (mathematics)|homology]]. Different ways of constructing homology could be shown to coincide: for example in the case of a [[simplicial complex]] the groups defined directly would be isomorphic to those of the singular theory. What cannot easily be expressed without the language of natural transformations is how homology groups are compatible with morphisms between objects, and how two equivalent homology theories not only have the same homology groups, but also the same morphisms between those groups.

==Symbols used==
* {{unichar|2297|CIRCLED TIMES|html=}}

== See also ==
* [[Extranatural transformation]]

== References ==
{{Portal|Category theory}}
{{reflist}}
{{refbegin}}
*{{citation| first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = Graduate Texts in Mathematics '''5''' | edition = 2nd | publisher = Springer-Verlag | isbn = 0-387-98403-8}}
* {{citation|first1=Saunders|last1=MacLane|authorlink1=Saunders MacLane|first2=Garrett|last2=Birkhoff|authorlink2=Garrett Birkhoff|title=Algebra|edition=3rd|publisher=AMS Chelsea Publishing|year=1999|isbn=0-8218-1646-2}}.
{{refend}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [[André Joyal]], [http://ncatlab.org/nlab CatLab], a wiki project dedicated to the exposition of categorical mathematics
* {{cite web | first = Chris | last = Hillman | title = A Categorical Primer | id = {{citeseerx|10.1.1.24.3264}} | postscript = : }} formal introduction to category theory.
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
* [[Stanford Encyclopedia of Philosophy]]: "[http://plato.stanford.edu/entries/category-theory/ Category Theory]" -- by Jean-Pierre Marquis. Extensive bibliography.
* [http://www.mta.ca/~cat-dist/ List of academic conferences on category theory]
* Baez, John, 1996,"[http://math.ucr.edu/home/baez/week73.html The Tale of ''n''-categories.]" An informal introduction to higher order categories.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Functors]]

[[es:Transformación natural]]
[[fr:Transformation naturelle]]
[[it:Trasformazione naturale]]
[[nl:Natuurlijke transformatie]]
[[ja:自然変換]]
[[pl:Transformacja naturalna]]
[[ru:Естественное преобразование]]
[[sv:Naturlig transformation]]

Natural transformation

2012-11-11T23:36:55Z

Magmalex: /* Historical notes */

{{About|natural transformations in category theory|the natural competence of bacteria to take up foreign DNA|Transformation (genetics)}}
{{other uses|Transformation (mathematics) (disambiguation)}}
In [[category theory]], a branch of [[mathematics]], a '''natural transformation''' provides a way of transforming one [[functor]] into another while respecting the internal structure (i.e. the composition of [[morphism]]s) of the categories involved. Hence, a natural transformation can be considered to be a "morphism of functors". Indeed this intuition can be formalized to define so-called [[functor category|functor categories]]. Natural transformations are, after categories and functors, one of the most basic notions of [[category theory]] and consequently appear in the majority of its applications.

==Definition==
If ''F'' and ''G'' are [[functor]]s between the categories ''C'' and ''D'', then a '''natural transformation''' η from ''F'' to ''G'' associates to every object ''X'' in ''C'' a [[morphism]] {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} between objects of ''D'', called the '''component''' of η at ''X'', such that for every morphism {{nobreak|1=''f'' : ''X'' → ''Y'' in ''C''}} we have:

:<math>\eta_Y \circ F(f) = G(f) \circ \eta_X</math>

This equation can conveniently be expressed by the [[commutative diagram]]

[[File:Natural transformation.svg|175px]]

If both ''F'' and ''G'' are [[contravariant functor|contravariant]], the horizontal arrows in this diagram are reversed. If η is a natural transformation from ''F'' to ''G'', we also write {{nobreak|1=η : ''F'' → ''G''}} or {{nobreak|1=η : ''F'' ⇒ ''G''}}. This is also expressed by saying the family of morphisms {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} is '''natural''' in ''X''.

If, for every object ''X'' in ''C'', the morphism η''X'' is an [[isomorphism]] in ''D'', then η is said to be a '''{{visible anchor|natural isomorphism}}''' (or sometimes '''natural equivalence''' or '''isomorphism of functors'''). Two functors ''F'' and ''G'' are called ''naturally isomorphic'' or simply ''isomorphic'' if there exists a natural isomorphism from ''F'' to ''G''.

An '''infranatural transformation''' η from ''F'' to ''G'' is simply a family of morphisms {{nobreak|1=η''X'': ''F''(''X'') → ''G''(''X'')}}. Thus a natural transformation is an infranatural transformation for which {{nobreak|1=η''Y'' ∘ ''F''(''f'') = ''G''(''f'') ∘ η''X''}} for every morphism {{nobreak|1=''f'' : ''X'' → ''Y''}}. The '''naturalizer''' of η, nat(η), is the largest [[subcategory]] of ''C'' containing all the objects of ''C'' on which η restricts to a natural transformation.

==Examples==
===Opposite group===
{{details|Opposite group}}
Statements such as
:"Every group is naturally isomorphic to its [[opposite group]]"
abound in modern mathematics. We will now give the precise meaning of this statement as well as its proof. Consider the category '''Grp''' of all [[group (mathematics)|group]]s with [[group homomorphism]]s as morphisms. If (''G'',*) is a group, we define its opposite group (''G''op,*op) as follows: ''G''op is the same set as ''G'', and the operation *op is defined by {{nobreak|1=''a'' *op ''b'' = ''b'' * ''a''}}. All multiplications in ''G''op are thus "turned around". Forming the [[Opposite category|opposite]] group becomes a (covariant!) functor from '''Grp''' to '''Grp''' if we define {{nobreak|1=''f''op = ''f''}} for any group homomorphism {{nobreak|1=''f'': ''G'' → ''H''}}. Note that ''f''op is indeed a group homomorphism from ''G''op to ''H''op:
:''f''op(''a'' *op ''b'') = ''f''(''b'' * ''a'') = ''f''(''b'') * ''f''(''a'') = ''f''op(''a'') *op ''f''op(''b'').
The content of the above statement is:
:"The identity functor {{nobreak|1=Id'''Grp''' : '''Grp''' → '''Grp'''}} is naturally isomorphic to the opposite functor {{nobreak|1=op : '''Grp''' → '''Grp'''}}."
To prove this, we need to provide isomorphisms {{nobreak|1=η''G'' : ''G'' → ''G''op}} for every group ''G'', such that the above diagram commutes. Set {{nobreak|1=η''G''(''a'') = ''a''−1}}. The formulas {{nobreak|1=(''ab'')−1 = ''b''−1 ''a''−1}} and {{nobreak|1=(''a''−1)−1 = ''a''}} show that η''G'' is a group homomorphism which is its own inverse. To prove the naturality, we start with a group homomorphism {{nobreak|1=''f'' : ''G'' → ''H''}} and show {{nobreak|1=η''H'' ∘ ''f'' = ''f''op ∘ η''G''}}, i.e. {{nobreak|1=(''f''(''a''))−1 = ''f''op(''a''−1)}} for all ''a'' in ''G''. This is true since {{nobreak|1=''f''op = ''f''}} and every group homomorphism has the property {{nobreak|1=(''f''(''a''))−1 = ''f''(''a''−1)}}.

===Double dual of a finite dimensional vector space===
If ''K'' is a [[field (mathematics)|field]], then for every [[vector space]] ''V'' over ''K'' we have a "natural" [[injective]] [[linear map]] {{nobreak|1=''V'' → ''V''**}} from the vector space into its [[double dual]]. These maps are "natural" in the following sense: the double dual operation is a functor, and the maps are the components of a natural transformation from the identity functor to the double dual functor.

===Counterexample: dual of a finite-dimensional vector space===
Every finite-dimensional vector space is isomorphic to its dual space, but this isomorphism relies on an arbitrary choice of isomorphism (for example, via choosing a basis and then taking the isomorphism sending this basis to the corresponding [[dual basis]]). There is in general no natural isomorphism between a finite-dimensional vector space and its dual space.<ref>{{harv|MacLane|Birkhoff|1999|loc=§VI.4}}</ref> However, related categories (with additional structure and restrictions on the maps) do have a natural isomorphism, as described below.

The dual space of a finite-dimensional vector space is again a finite-dimensional vector space of the same dimension, and these are thus isomorphic, since dimension is the only invariant of finite-dimensional vector spaces over a given field. However, in the absence of additional data (such as a basis), there is no given map from a space to its dual, and thus such an isomorphism requires a choice, and is "not natural". On the category of finite-dimensional vector spaces and linear maps, one can define an infranatural isomorphism from vector spaces to their dual by choosing an isomorphism for each space (say, by choosing a basis for every vector space and taking the corresponding isomorphism), but this will not define a natural transformation. Intuitively this is because it required a choice, rigorously because ''any'' such choice of isomorphisms will not commute with ''all'' linear maps; see {{harv|MacLane|Birkhoff|1999|loc=§VI.4}} for detailed discussion.

Starting from finite-dimensional vector spaces (as objects) and the dual functor, one can define a natural isomorphism, but this requires first adding additional structure, then restricting the maps from "all linear maps" to "linear maps that respect this structure". Explicitly, for each vector space, require that it come with the data of an isomorphism to its dual, <math>\eta_V\colon V \to V^*.</math> In other words, take as objects vector spaces with a [[nondegenerate bilinear form]] <math>b_V\colon V \times V \to K.</math> This defines an infranatural isomorphism (isomorphism for each object). One then restricts the maps to only those maps that commute with these isomorphism (restricts to the naturalizer of ''η''), in other words, restrict to the maps that do not change the bilinear form: <math>b(T(v),T(w))=b(v,w).</math> The resulting category, with objects finite-dimensional vector spaces with a nondegenerate bilinear form, and maps linear transforms that respect the bilinear form, by construction has a natural isomorphism from the identity to the dual (each space has an isomorphism to its dual, and the maps in the category are required to commute). Viewed in this light, this construction (add transforms for each object, restrict maps to commute with these) is completely general, and does not depend on any particular properties of vector spaces.

In this category (finite-dimensional vector spaces with a nondegenerate bilinear form, maps linear transforms that respect the bilinear form), the dual of a map between vector spaces can be identified as a [[transpose]]. Often for reasons of geometric interest this is specialized to a subcategory, by requiring that the nondegenerate bilinear forms have additional properties, such as being symmetric ([[orthogonal matrices]]), symmetric and positive definite ([[inner product space]]), symmetric sesquilinear ([[Hermitian space]]s), skew-symmetric and totally isotropic ([[symplectic vector space]]), etc. – in all these categories a vector space is naturally identified with its dual, by the nondegenerate bilinear form.

===Tensor-hom adjunction===
{{see|Tensor-hom adjunction|Adjoint functors}}
Consider the [[category of abelian groups|category '''Ab''' of abelian groups and group homomorphisms]]. For all abelian groups ''X'', ''Y'' and ''Z'' we have a group isomorphism
:{{nobreak|1=Hom(''X'' {{otimes}} ''Y'', ''Z'') → Hom(''X'', Hom(''Y'', ''Z''))}}.
These isomorphisms are "natural" in the sense that they define a natural transformation between the two involved functors {{nobreak|1='''Ab''' × '''Ab'''op × '''Ab'''op → '''Ab'''}}.

This is formally the [[tensor-hom adjunction]], and is an archetypal example of a pair of [[adjoint functors]]. Natural transformations arise frequently in conjunction with adjoint functors, and indeed, adjoint functors are defined by a certain natural isomorphism. Additionally, every pair of adjoint functors comes equipped with two natural transformations (generally not isomorphisms) called the ''unit'' and ''counit''.

== Unnatural isomorphism ==
The notion of a natural transformation is categorical, and states (informally) that a particular map between functors can be done consistently over an entire category. Informally, a particular map (esp. an isomorphism) between individual objects (not entire categories) is referred to as a "natural isomorphism", meaning implicitly that it is actually defined on the entire category, and defines a natural transformation of functors; formalizing this intuition was a motivating factor in the development of category theory. Conversely, a particular map between particular objects may be called an '''unnatural isomorphism''' (or "this isomorphism is not natural") if the map cannot be extended to a natural transformation on the entire category. Given an object ''X,'' a functor ''G'' (taking for simplicity the first functor to be the identity) and an isomorphism <math>\eta\colon X \to G(X),</math> proof of unnaturality is most easily shown by giving an automorphism <math>A\colon X \to X</math> that does not commute with this isomorphism (so <math>\eta \circ A \neq G(A) \circ \eta</math>). More strongly, if one wishes to prove that ''X'' and ''G''(''X'') are not naturally isomorphic, without reference to a particular isomorphism, this requires showing that for ''any'' isomorphism ''η,'' there is some ''A'' with which it does not commute; in some cases a single automorphism ''A'' works for all candidate isomorphisms ''η,'' while in other cases one must show how to construct a different ''A''''η'' for each isomorphism. The maps of the category play a crucial role – any infranatural transform is natural if the only maps are the identity map, for instance.

This is similar (but more categorical) to concepts in group theory or module theory, where a given decomposition of an object into a direct sum is "not natural", or rather "not unique", as automorphisms exist that do not preserve the direct sum decomposition – see [[Structure theorem for finitely generated modules over a principal ideal domain#Uniqueness]] for example.

Some authors distinguish notationally, using ≅ for a natural isomorphism and ≈ for an unnatural isomorphism, reserving = for equality (usually equality of maps).

== Operations with natural transformations ==
If {{nobreak|1=η : ''F'' → ''G''}} and {{nobreak|1=ε : ''G'' → ''H''}} are natural transformations between functors {{nobreak|1=''F'',''G'',''H'' : ''C'' → ''D''}}, then we can compose them to get a natural transformation {{nobreak|1=εη : ''F'' → ''H''}}. This is done componentwise: {{nobreak|1=(εη)''X'' = ε''X''η''X''}}. This "vertical composition" of natural transformation is [[associative]] and has an identity, and allows one to consider the collection of all functors {{nobreak|1=''C'' → ''D''}} itself as a category (see below under [[#Functor categories|Functor categories]]).

Natural transformations also have a "horizontal composition". If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}} and {{nobreak|1=ε : ''J'' → ''K''}} is a natural transformation between functors {{nobreak|1=''J'',''K'' : ''D'' → ''E''}}, then the composition of functors allows a composition of natural transformations {{nobreak|1=ηε : ''JF'' → ''KG''}}. This operation is also associative with identity, and the identity coincides with that for vertical composition. The two operations are related by an identity which exchanges vertical composition with horizontal composition.

If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}}, and {{nobreak|1=''H'' : ''D'' → ''E''}} is another functor, then we can form the natural transformation {{nobreak|1=''H''η : ''HF'' → ''HG''}} by defining

:<math> (H \eta)_X = H \eta_X. </math>

If on the other hand {{nobreak|1=''K'' : ''B'' → ''C''}} is a functor, the natural transformation {{nobreak|1=η''K'' : ''FK'' → ''GK''}} is defined by

:<math> (\eta K)_X = \eta_{K(X)}.\, </math>

==Functor categories==

{{Main|Functor category}}
If ''C'' is any category and ''I'' is a [[small category]], we can form the [[functor category]] ''CI'' having as objects all functors from ''I'' to ''C'' and as morphisms the natural transformations between those functors. This forms a category since for any functor ''F'' there is an identity natural transformation {{nobreak|1=1''F'' : ''F'' → ''F''}} (which assigns to every object ''X'' the identity morphism on ''F''(''X'')) and the composition of two natural transformations (the "vertical composition" above) is again a natural transformation.

The [[isomorphism]]s in ''CI'' are precisely the natural isomorphisms. That is, a natural transformation {{nobreak|1=η : ''F'' → ''G''}} is a natural isomorphism if and only if there exists a natural transformation {{nobreak|1=ε : ''G'' → ''F''}} such that {{nobreak|1=ηε = 1''G''}} and {{nobreak|1=εη = 1''F''}}.

The functor category ''CI'' is especially useful if ''I'' arises from a [[directed graph]]. For instance, if ''I'' is the category of the directed graph {{nobreak|1=• → •}}, then ''CI'' has as objects the morphisms of ''C'', and a morphism between {{nobreak|1=φ : ''U'' → ''V''}} and {{nobreak|1=ψ : ''X'' → ''Y''}} in ''CI'' is a pair of morphisms {{nobreak|1=''f'' : ''U'' → ''X''}} and {{nobreak|1=''g'' : ''V'' → ''Y''}} in ''C'' such that the "square commutes", i.e. {{nobreak|1=ψ ''f'' = ''g'' φ}}.

More generally, one can build the [[2-category]] '''Cat''' whose
* 0-cells (objects) are the small categories,
* 1-cells (arrows) between two objects <math>C</math> and <math>D</math> are the functors from <math>C</math> to <math>D</math>,
* 2-cells between two 1-cells (functors) <math>F:C\to D</math> and <math>G:C\to D</math> are the natural transformations from <math>F</math> to <math>G</math>.
The horizontal and vertical compositions are the compositions between natural transformations described previously. A functor category <math>C^I</math> is then simply a hom-category in this category (smallness issues aside).

==Yoneda lemma==

{{Main|Yoneda lemma}}
If ''X'' is an object of a [[locally small category]] ''C'', then the assignment {{nobreak|1=''Y'' {{mapsto}} Hom''C''(''X'', ''Y'')}} defines a covariant functor {{nobreak|1=''F''''X'' : ''C'' → '''Set'''}}. This functor is called ''[[representable functor|representable]]'' (more generally, a representable functor is any functor naturally isomorphic to this functor for an appropriate choice of ''X''). The natural transformations from a representable functor to an arbitrary functor {{nobreak|1=''F'' : ''C'' → '''Set'''}} are completely known and easy to describe; this is the content of the [[Yoneda lemma]].

== Historical notes ==
{{Unreferenced section|date=October 2008}}
[[Saunders Mac Lane]], one of the founders of category theory, is said to have remarked, "I didn't invent categories to study functors; I invented them to study natural transformations."<ref>{{harv|MacLane}}</ref> Just as the study of [[group (mathematics)|groups]] is not complete without a study of [[group homomorphism|homomorphisms]], so the study of categories is not complete without the study of [[functor]]s. The reason for Mac Lane's comment is that the study of functors is itself not complete without the study of natural transformations.

The context of Mac Lane's remark was the axiomatic theory of [[homology (mathematics)|homology]]. Different ways of constructing homology could be shown to coincide: for example in the case of a [[simplicial complex]] the groups defined directly would be isomorphic to those of the singular theory. What cannot easily be expressed without the language of natural transformations is how homology groups are compatible with morphisms between objects, and how two equivalent homology theories not only have the same homology groups, but also the same morphisms between those groups.

==Symbols used==
* {{unichar|2297|CIRCLED TIMES|html=}}

== See also ==
* [[Extranatural transformation]]

== References ==
{{Portal|Category theory}}
{{reflist}}
{{refbegin}}
*{{cite book | first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = Graduate Texts in Mathematics '''5''' | edition = 2nd | publisher = Springer-Verlag | isbn = 0-387-98403-8}}
* {{citation|first1=Saunders|last1=MacLane|authorlink1=Saunders MacLane|first2=Garrett|last2=Birkhoff|authorlink2=Garrett Birkhoff|title=Algebra|edition=3rd|publisher=AMS Chelsea Publishing|year=1999|isbn=0-8218-1646-2}}.
{{refend}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [[André Joyal]], [http://ncatlab.org/nlab CatLab], a wiki project dedicated to the exposition of categorical mathematics
* {{cite web | first = Chris | last = Hillman | title = A Categorical Primer | id = {{citeseerx|10.1.1.24.3264}} | postscript = : }} formal introduction to category theory.
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
* [[Stanford Encyclopedia of Philosophy]]: "[http://plato.stanford.edu/entries/category-theory/ Category Theory]" -- by Jean-Pierre Marquis. Extensive bibliography.
* [http://www.mta.ca/~cat-dist/ List of academic conferences on category theory]
* Baez, John, 1996,"[http://math.ucr.edu/home/baez/week73.html The Tale of ''n''-categories.]" An informal introduction to higher order categories.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Functors]]

[[es:Transformación natural]]
[[fr:Transformation naturelle]]
[[it:Trasformazione naturale]]
[[nl:Natuurlijke transformatie]]
[[ja:自然変換]]
[[pl:Transformacja naturalna]]
[[ru:Естественное преобразование]]
[[sv:Naturlig transformation]]

Natural transformation

2012-11-11T23:33:22Z

Magmalex: /* Historical notes */ Inline reference

{{About|natural transformations in category theory|the natural competence of bacteria to take up foreign DNA|Transformation (genetics)}}
{{other uses|Transformation (mathematics) (disambiguation)}}
In [[category theory]], a branch of [[mathematics]], a '''natural transformation''' provides a way of transforming one [[functor]] into another while respecting the internal structure (i.e. the composition of [[morphism]]s) of the categories involved. Hence, a natural transformation can be considered to be a "morphism of functors". Indeed this intuition can be formalized to define so-called [[functor category|functor categories]]. Natural transformations are, after categories and functors, one of the most basic notions of [[category theory]] and consequently appear in the majority of its applications.

==Definition==
If ''F'' and ''G'' are [[functor]]s between the categories ''C'' and ''D'', then a '''natural transformation''' η from ''F'' to ''G'' associates to every object ''X'' in ''C'' a [[morphism]] {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} between objects of ''D'', called the '''component''' of η at ''X'', such that for every morphism {{nobreak|1=''f'' : ''X'' → ''Y'' in ''C''}} we have:

:<math>\eta_Y \circ F(f) = G(f) \circ \eta_X</math>

This equation can conveniently be expressed by the [[commutative diagram]]

[[File:Natural transformation.svg|175px]]

If both ''F'' and ''G'' are [[contravariant functor|contravariant]], the horizontal arrows in this diagram are reversed. If η is a natural transformation from ''F'' to ''G'', we also write {{nobreak|1=η : ''F'' → ''G''}} or {{nobreak|1=η : ''F'' ⇒ ''G''}}. This is also expressed by saying the family of morphisms {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} is '''natural''' in ''X''.

If, for every object ''X'' in ''C'', the morphism η''X'' is an [[isomorphism]] in ''D'', then η is said to be a '''{{visible anchor|natural isomorphism}}''' (or sometimes '''natural equivalence''' or '''isomorphism of functors'''). Two functors ''F'' and ''G'' are called ''naturally isomorphic'' or simply ''isomorphic'' if there exists a natural isomorphism from ''F'' to ''G''.

An '''infranatural transformation''' η from ''F'' to ''G'' is simply a family of morphisms {{nobreak|1=η''X'': ''F''(''X'') → ''G''(''X'')}}. Thus a natural transformation is an infranatural transformation for which {{nobreak|1=η''Y'' ∘ ''F''(''f'') = ''G''(''f'') ∘ η''X''}} for every morphism {{nobreak|1=''f'' : ''X'' → ''Y''}}. The '''naturalizer''' of η, nat(η), is the largest [[subcategory]] of ''C'' containing all the objects of ''C'' on which η restricts to a natural transformation.

==Examples==
===Opposite group===
{{details|Opposite group}}
Statements such as
:"Every group is naturally isomorphic to its [[opposite group]]"
abound in modern mathematics. We will now give the precise meaning of this statement as well as its proof. Consider the category '''Grp''' of all [[group (mathematics)|group]]s with [[group homomorphism]]s as morphisms. If (''G'',*) is a group, we define its opposite group (''G''op,*op) as follows: ''G''op is the same set as ''G'', and the operation *op is defined by {{nobreak|1=''a'' *op ''b'' = ''b'' * ''a''}}. All multiplications in ''G''op are thus "turned around". Forming the [[Opposite category|opposite]] group becomes a (covariant!) functor from '''Grp''' to '''Grp''' if we define {{nobreak|1=''f''op = ''f''}} for any group homomorphism {{nobreak|1=''f'': ''G'' → ''H''}}. Note that ''f''op is indeed a group homomorphism from ''G''op to ''H''op:
:''f''op(''a'' *op ''b'') = ''f''(''b'' * ''a'') = ''f''(''b'') * ''f''(''a'') = ''f''op(''a'') *op ''f''op(''b'').
The content of the above statement is:
:"The identity functor {{nobreak|1=Id'''Grp''' : '''Grp''' → '''Grp'''}} is naturally isomorphic to the opposite functor {{nobreak|1=op : '''Grp''' → '''Grp'''}}."
To prove this, we need to provide isomorphisms {{nobreak|1=η''G'' : ''G'' → ''G''op}} for every group ''G'', such that the above diagram commutes. Set {{nobreak|1=η''G''(''a'') = ''a''−1}}. The formulas {{nobreak|1=(''ab'')−1 = ''b''−1 ''a''−1}} and {{nobreak|1=(''a''−1)−1 = ''a''}} show that η''G'' is a group homomorphism which is its own inverse. To prove the naturality, we start with a group homomorphism {{nobreak|1=''f'' : ''G'' → ''H''}} and show {{nobreak|1=η''H'' ∘ ''f'' = ''f''op ∘ η''G''}}, i.e. {{nobreak|1=(''f''(''a''))−1 = ''f''op(''a''−1)}} for all ''a'' in ''G''. This is true since {{nobreak|1=''f''op = ''f''}} and every group homomorphism has the property {{nobreak|1=(''f''(''a''))−1 = ''f''(''a''−1)}}.

===Double dual of a finite dimensional vector space===
If ''K'' is a [[field (mathematics)|field]], then for every [[vector space]] ''V'' over ''K'' we have a "natural" [[injective]] [[linear map]] {{nobreak|1=''V'' → ''V''**}} from the vector space into its [[double dual]]. These maps are "natural" in the following sense: the double dual operation is a functor, and the maps are the components of a natural transformation from the identity functor to the double dual functor.

===Counterexample: dual of a finite-dimensional vector space===
Every finite-dimensional vector space is isomorphic to its dual space, but this isomorphism relies on an arbitrary choice of isomorphism (for example, via choosing a basis and then taking the isomorphism sending this basis to the corresponding [[dual basis]]). There is in general no natural isomorphism between a finite-dimensional vector space and its dual space.<ref>{{harv|MacLane|Birkhoff|1999|loc=§VI.4}}</ref> However, related categories (with additional structure and restrictions on the maps) do have a natural isomorphism, as described below.

The dual space of a finite-dimensional vector space is again a finite-dimensional vector space of the same dimension, and these are thus isomorphic, since dimension is the only invariant of finite-dimensional vector spaces over a given field. However, in the absence of additional data (such as a basis), there is no given map from a space to its dual, and thus such an isomorphism requires a choice, and is "not natural". On the category of finite-dimensional vector spaces and linear maps, one can define an infranatural isomorphism from vector spaces to their dual by choosing an isomorphism for each space (say, by choosing a basis for every vector space and taking the corresponding isomorphism), but this will not define a natural transformation. Intuitively this is because it required a choice, rigorously because ''any'' such choice of isomorphisms will not commute with ''all'' linear maps; see {{harv|MacLane|Birkhoff|1999|loc=§VI.4}} for detailed discussion.

Starting from finite-dimensional vector spaces (as objects) and the dual functor, one can define a natural isomorphism, but this requires first adding additional structure, then restricting the maps from "all linear maps" to "linear maps that respect this structure". Explicitly, for each vector space, require that it come with the data of an isomorphism to its dual, <math>\eta_V\colon V \to V^*.</math> In other words, take as objects vector spaces with a [[nondegenerate bilinear form]] <math>b_V\colon V \times V \to K.</math> This defines an infranatural isomorphism (isomorphism for each object). One then restricts the maps to only those maps that commute with these isomorphism (restricts to the naturalizer of ''η''), in other words, restrict to the maps that do not change the bilinear form: <math>b(T(v),T(w))=b(v,w).</math> The resulting category, with objects finite-dimensional vector spaces with a nondegenerate bilinear form, and maps linear transforms that respect the bilinear form, by construction has a natural isomorphism from the identity to the dual (each space has an isomorphism to its dual, and the maps in the category are required to commute). Viewed in this light, this construction (add transforms for each object, restrict maps to commute with these) is completely general, and does not depend on any particular properties of vector spaces.

In this category (finite-dimensional vector spaces with a nondegenerate bilinear form, maps linear transforms that respect the bilinear form), the dual of a map between vector spaces can be identified as a [[transpose]]. Often for reasons of geometric interest this is specialized to a subcategory, by requiring that the nondegenerate bilinear forms have additional properties, such as being symmetric ([[orthogonal matrices]]), symmetric and positive definite ([[inner product space]]), symmetric sesquilinear ([[Hermitian space]]s), skew-symmetric and totally isotropic ([[symplectic vector space]]), etc. – in all these categories a vector space is naturally identified with its dual, by the nondegenerate bilinear form.

===Tensor-hom adjunction===
{{see|Tensor-hom adjunction|Adjoint functors}}
Consider the [[category of abelian groups|category '''Ab''' of abelian groups and group homomorphisms]]. For all abelian groups ''X'', ''Y'' and ''Z'' we have a group isomorphism
:{{nobreak|1=Hom(''X'' {{otimes}} ''Y'', ''Z'') → Hom(''X'', Hom(''Y'', ''Z''))}}.
These isomorphisms are "natural" in the sense that they define a natural transformation between the two involved functors {{nobreak|1='''Ab''' × '''Ab'''op × '''Ab'''op → '''Ab'''}}.

This is formally the [[tensor-hom adjunction]], and is an archetypal example of a pair of [[adjoint functors]]. Natural transformations arise frequently in conjunction with adjoint functors, and indeed, adjoint functors are defined by a certain natural isomorphism. Additionally, every pair of adjoint functors comes equipped with two natural transformations (generally not isomorphisms) called the ''unit'' and ''counit''.

== Unnatural isomorphism ==
The notion of a natural transformation is categorical, and states (informally) that a particular map between functors can be done consistently over an entire category. Informally, a particular map (esp. an isomorphism) between individual objects (not entire categories) is referred to as a "natural isomorphism", meaning implicitly that it is actually defined on the entire category, and defines a natural transformation of functors; formalizing this intuition was a motivating factor in the development of category theory. Conversely, a particular map between particular objects may be called an '''unnatural isomorphism''' (or "this isomorphism is not natural") if the map cannot be extended to a natural transformation on the entire category. Given an object ''X,'' a functor ''G'' (taking for simplicity the first functor to be the identity) and an isomorphism <math>\eta\colon X \to G(X),</math> proof of unnaturality is most easily shown by giving an automorphism <math>A\colon X \to X</math> that does not commute with this isomorphism (so <math>\eta \circ A \neq G(A) \circ \eta</math>). More strongly, if one wishes to prove that ''X'' and ''G''(''X'') are not naturally isomorphic, without reference to a particular isomorphism, this requires showing that for ''any'' isomorphism ''η,'' there is some ''A'' with which it does not commute; in some cases a single automorphism ''A'' works for all candidate isomorphisms ''η,'' while in other cases one must show how to construct a different ''A''''η'' for each isomorphism. The maps of the category play a crucial role – any infranatural transform is natural if the only maps are the identity map, for instance.

This is similar (but more categorical) to concepts in group theory or module theory, where a given decomposition of an object into a direct sum is "not natural", or rather "not unique", as automorphisms exist that do not preserve the direct sum decomposition – see [[Structure theorem for finitely generated modules over a principal ideal domain#Uniqueness]] for example.

Some authors distinguish notationally, using ≅ for a natural isomorphism and ≈ for an unnatural isomorphism, reserving = for equality (usually equality of maps).

== Operations with natural transformations ==
If {{nobreak|1=η : ''F'' → ''G''}} and {{nobreak|1=ε : ''G'' → ''H''}} are natural transformations between functors {{nobreak|1=''F'',''G'',''H'' : ''C'' → ''D''}}, then we can compose them to get a natural transformation {{nobreak|1=εη : ''F'' → ''H''}}. This is done componentwise: {{nobreak|1=(εη)''X'' = ε''X''η''X''}}. This "vertical composition" of natural transformation is [[associative]] and has an identity, and allows one to consider the collection of all functors {{nobreak|1=''C'' → ''D''}} itself as a category (see below under [[#Functor categories|Functor categories]]).

Natural transformations also have a "horizontal composition". If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}} and {{nobreak|1=ε : ''J'' → ''K''}} is a natural transformation between functors {{nobreak|1=''J'',''K'' : ''D'' → ''E''}}, then the composition of functors allows a composition of natural transformations {{nobreak|1=ηε : ''JF'' → ''KG''}}. This operation is also associative with identity, and the identity coincides with that for vertical composition. The two operations are related by an identity which exchanges vertical composition with horizontal composition.

If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}}, and {{nobreak|1=''H'' : ''D'' → ''E''}} is another functor, then we can form the natural transformation {{nobreak|1=''H''η : ''HF'' → ''HG''}} by defining

:<math> (H \eta)_X = H \eta_X. </math>

If on the other hand {{nobreak|1=''K'' : ''B'' → ''C''}} is a functor, the natural transformation {{nobreak|1=η''K'' : ''FK'' → ''GK''}} is defined by

:<math> (\eta K)_X = \eta_{K(X)}.\, </math>

==Functor categories==

{{Main|Functor category}}
If ''C'' is any category and ''I'' is a [[small category]], we can form the [[functor category]] ''CI'' having as objects all functors from ''I'' to ''C'' and as morphisms the natural transformations between those functors. This forms a category since for any functor ''F'' there is an identity natural transformation {{nobreak|1=1''F'' : ''F'' → ''F''}} (which assigns to every object ''X'' the identity morphism on ''F''(''X'')) and the composition of two natural transformations (the "vertical composition" above) is again a natural transformation.

The [[isomorphism]]s in ''CI'' are precisely the natural isomorphisms. That is, a natural transformation {{nobreak|1=η : ''F'' → ''G''}} is a natural isomorphism if and only if there exists a natural transformation {{nobreak|1=ε : ''G'' → ''F''}} such that {{nobreak|1=ηε = 1''G''}} and {{nobreak|1=εη = 1''F''}}.

The functor category ''CI'' is especially useful if ''I'' arises from a [[directed graph]]. For instance, if ''I'' is the category of the directed graph {{nobreak|1=• → •}}, then ''CI'' has as objects the morphisms of ''C'', and a morphism between {{nobreak|1=φ : ''U'' → ''V''}} and {{nobreak|1=ψ : ''X'' → ''Y''}} in ''CI'' is a pair of morphisms {{nobreak|1=''f'' : ''U'' → ''X''}} and {{nobreak|1=''g'' : ''V'' → ''Y''}} in ''C'' such that the "square commutes", i.e. {{nobreak|1=ψ ''f'' = ''g'' φ}}.

More generally, one can build the [[2-category]] '''Cat''' whose
* 0-cells (objects) are the small categories,
* 1-cells (arrows) between two objects <math>C</math> and <math>D</math> are the functors from <math>C</math> to <math>D</math>,
* 2-cells between two 1-cells (functors) <math>F:C\to D</math> and <math>G:C\to D</math> are the natural transformations from <math>F</math> to <math>G</math>.
The horizontal and vertical compositions are the compositions between natural transformations described previously. A functor category <math>C^I</math> is then simply a hom-category in this category (smallness issues aside).

==Yoneda lemma==

{{Main|Yoneda lemma}}
If ''X'' is an object of a [[locally small category]] ''C'', then the assignment {{nobreak|1=''Y'' {{mapsto}} Hom''C''(''X'', ''Y'')}} defines a covariant functor {{nobreak|1=''F''''X'' : ''C'' → '''Set'''}}. This functor is called ''[[representable functor|representable]]'' (more generally, a representable functor is any functor naturally isomorphic to this functor for an appropriate choice of ''X''). The natural transformations from a representable functor to an arbitrary functor {{nobreak|1=''F'' : ''C'' → '''Set'''}} are completely known and easy to describe; this is the content of the [[Yoneda lemma]].

== Historical notes ==
{{Unreferenced section|date=October 2008}}
[[Saunders Mac Lane]], one of the founders of category theory, is said to have remarked, "I didn't invent categories to study functors; I invented them to study natural transformations."<ref>{{harv|MacLane|Birkhoff|1999|loc=§VI.4}}</ref> Just as the study of [[group (mathematics)|groups]] is not complete without a study of [[group homomorphism|homomorphisms]], so the study of categories is not complete without the study of [[functor]]s. The reason for Mac Lane's comment is that the study of functors is itself not complete without the study of natural transformations.

The context of Mac Lane's remark was the axiomatic theory of [[homology (mathematics)|homology]]. Different ways of constructing homology could be shown to coincide: for example in the case of a [[simplicial complex]] the groups defined directly would be isomorphic to those of the singular theory. What cannot easily be expressed without the language of natural transformations is how homology groups are compatible with morphisms between objects, and how two equivalent homology theories not only have the same homology groups, but also the same morphisms between those groups.

==Symbols used==
* {{unichar|2297|CIRCLED TIMES|html=}}

== See also ==
* [[Extranatural transformation]]

== References ==
{{Portal|Category theory}}
{{reflist}}
{{refbegin}}
*{{cite book | first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = Graduate Texts in Mathematics '''5''' | edition = 2nd | publisher = Springer-Verlag | isbn = 0-387-98403-8}}
* {{citation|first1=Saunders|last1=MacLane|authorlink1=Saunders MacLane|first2=Garrett|last2=Birkhoff|authorlink2=Garrett Birkhoff|title=Algebra|edition=3rd|publisher=AMS Chelsea Publishing|year=1999|isbn=0-8218-1646-2}}.
{{refend}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [[André Joyal]], [http://ncatlab.org/nlab CatLab], a wiki project dedicated to the exposition of categorical mathematics
* {{cite web | first = Chris | last = Hillman | title = A Categorical Primer | id = {{citeseerx|10.1.1.24.3264}} | postscript = : }} formal introduction to category theory.
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
* [[Stanford Encyclopedia of Philosophy]]: "[http://plato.stanford.edu/entries/category-theory/ Category Theory]" -- by Jean-Pierre Marquis. Extensive bibliography.
* [http://www.mta.ca/~cat-dist/ List of academic conferences on category theory]
* Baez, John, 1996,"[http://math.ucr.edu/home/baez/week73.html The Tale of ''n''-categories.]" An informal introduction to higher order categories.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Functors]]

[[es:Transformación natural]]
[[fr:Transformation naturelle]]
[[it:Trasformazione naturale]]
[[nl:Natuurlijke transformatie]]
[[ja:自然変換]]
[[pl:Transformacja naturalna]]
[[ru:Естественное преобразование]]
[[sv:Naturlig transformation]]

Electrochemistry

2012-10-10T00:46:35Z

Magmalex: /* 19th century */

[[File:Faraday-Daniell.PNG|thumb|English chemist [[John Frederic Daniell|John Daniell]] ([[relative direction|left]]) and physicist [[Michael Faraday]] ([[relative direction|right]]), both credited as founders of electrochemistry today.]]

'''Electrochemistry''' is a branch of [[chemistry]] that studies [[chemical reaction]]s which take place in a [[solution]] at the interface of an electron [[Electrical conductor|conductor]] (a [[metal]] or a [[semiconductor]]) and an ionic conductor (the [[electrolyte]]), and which involve electron transfer between the electrode and the electrolyte or species in solution.

If a chemical reaction is driven by an external applied [[voltage]], as in [[electrolysis]], or if a voltage is created by a chemical reaction as in a [[battery (electricity)|battery]], it is an ''electrochemical'' reaction. In contrast, chemical reactions where electrons are transferred between [[molecule]]s are called oxidation/reduction ([[redox]]) reactions. In general, electrochemistry deals with situations where [[oxidation]] and [[redox|reduction]] reactions are separated in space or time, connected by an external electric circuit.

==History==
{{Main|History of electrochemistry}}

===16th to 18th century developments===
[[File:Guericke-electricaldevice.PNG|thumb|left|[[Germany|German]] [[physicist]] [[Otto von Guericke]] beside his electrical generator while conducting an experiment.]]
Understanding of electrical matters began in the sixteenth century. During this century the English scientist [[William Gilbert (astronomer)|William Gilbert]] spent 17 years experimenting with [[magnetism]] and, to a lesser extent, electricity. For his work on magnets, Gilbert became known as the ''"Father of Magnetism."'' He discovered various methods for producing and strengthening magnets.<ref>Richard P. Olenick, Tom M. Apostol, David L. Goodstein [http://books.google.com/books?id=Ht4T7C7AXZIC&pg=PA160 Beyond the mechanical universe: from electricity to modern physics], Cambridge University Press (1986) ISBN 0-521-30430-X, p. 160</ref>

In 1663 the [[Germany|German]] [[physicist]] [[Otto von Guericke]] created the first electric generator, which produced static electricity by applying friction in the machine. The generator was made of a large [[sulfur]] ball cast inside a glass globe, mounted on a shaft. The ball was rotated by means of a crank and a [[static electricity|static electric]] [[electric spark|spark]] was produced when a pad was rubbed against the ball as it rotated. The globe could be removed and used as source for experiments with electricity.<ref>R. Hellborg [http://books.google.com/books?id=tc6CEuIV1jEC&pg=PA52 Electrostatic accelerators: fundamentals and applications] (2005) ISBN 3540239839 p. 52</ref>

By the mid—18th century the [[France|French]] [[chemist]] [[Charles François de Cisternay du Fay]] had discovered two types of static electricity, and that like charges repel each other whilst unlike charges attract. Du Fay announced that electricity consisted of two fluids: ''"vitreous"'' (from the [[Latin]] for ''"glass"''), or positive, electricity; and ''"resinous,"'' or negative, electricity. This was the ''two-fluid theory'' of electricity, which was to be opposed by [[Benjamin Franklin|Benjamin Franklin's]] ''one-fluid theory'' later in the century.<ref>Steven Weinberg [http://books.google.com/books?id=cKXuMfnMC4IC&pg=PA15 The discovery of subatomic particles] Cambridge University Press (2003) ISBN 0-521-82351-X, p. 15</ref>[[File:Galvani-frog-legs.PNG|thumb|left|Late 1780s diagram of Galvani's experiment on frog legs.]]

[[Charles-Augustin de Coulomb]] developed the law of [[electrostatic]] attraction in 1785 as an outgrowth of his attempt to investigate the law of electrical repulsions as stated by [[Joseph Priestley]] in England.<ref>J. A. M. Bleeker, Johannes Geiss, M. Huber [http://books.google.com/books?id=NMk3adgqfawC&pg=PA227 The century of space science, Volume 1], Springer (2001) ISBN 0-7923-7196-8 p. 227</ref>
[[File:Volta-and-napoleon.PNG|thumb|right|[[Italy|Italian]] [[physicist]] [[Alessandro Volta]] showing his ''"[[Battery (electricity)|battery]]"'' to [[France|French]] [[emperor]] [[Napoleon I of France|Napoleon Bonaparte]] in the early 19th century.]]
In the late 18th century the [[Italy|Italian]] [[physician]] and [[anatomist]] [[Luigi Galvani]] marked the birth of electrochemistry by establishing a bridge between chemical reactions and electricity on his essay ''"De Viribus Electricitatis in Motu Musculari Commentarius"'' (Latin for Commentary on the Effect of Electricity on Muscular Motion) in 1791 where he proposed a ''"nerveo-electrical substance"'' on biological life forms.<ref name=g>John Robert Norris, Douglas W. Ribbons [http://books.google.com/books?id=TFfQPQKSc3EC&pg=PA248 Methods in microbiology, Volume 6], Academic Press (1972) ISBN 0-12-521546-0 p. 248</ref>

In his essay Galvani concluded that animal tissue contained a here-to-fore neglected innate, vital force, which he termed ''"animal electricity,"'' which activated [[nerve]]s and [[muscle]]s spanned by metal probes. He believed that this new force was a form of electricity in addition to the ''"natural"'' form produced by [[lightning]] or by the [[electric eel]] and [[Electric ray|torpedo ray]] as well as the ''"artificial"'' form produced by [[friction]] (i.e., static electricity).<ref name=g2>Frederick Collier Bakewell [http://books.google.com/books?id=Lks1AAAAMAAJ&pg=PA28 Electric science; its history, phenomena, and applications], Ingram, Cooke (1853) pp. 27–31</ref>

Galvani's scientific colleagues generally accepted his views, but [[Alessandro Volta]] rejected the idea of an ''"animal electric fluid,"'' replying that the frog's legs responded to differences in [[metal temper]], composition, and bulk.<ref name=g/><ref name=g2/> Galvani refuted this by obtaining muscular action with two pieces of the same material.

===19th century===
[[File:Humphrydavy.jpg|thumb|left|upright|Sir Humphry Davy's portrait in the 19th century.]]
In 1800, [[William Nicholson (chemist)|William Nicholson]] and [[Johann Wilhelm Ritter]] succeeded in decomposing water into [[hydrogen]] and [[oxygen]] by [[electrolysis]]. Soon thereafter Ritter discovered the process of [[electroplating]]. He also observed that the amount of metal deposited and the amount of oxygen produced during an electrolytic process depended on the distance between the [[electrode]]s.<ref name=lai/> By 1801 Ritter observed [[thermoelectricity|thermoelectric currents]] and anticipated the discovery of thermoelectricity by [[Thomas Johann Seebeck]].<ref>The New Encyclopaedia Britannica: Micropædia, Vol. 10 (1991) ISBN 0-85229-529-4, p. 90</ref>

By the 1810s [[William Hyde Wollaston]] made improvements to the [[galvanic cell]].
Sir [[Humphry Davy]]'s work with electrolysis led to the conclusion that the production of electricity in simple [[electrolytic cell]]s resulted from chemical action and that chemical combination occurred between substances of opposite charge. This work led directly to the isolation of [[sodium]] and [[potassium]] from their compounds and of the [[alkaline earth metal]]s from theirs in 1808.<ref>Charles Knight (ed.) [http://books.google.com/books?id=BchPAAAAMAAJ&pg=RA2-PT168 Biography: or, Third division of "The English encyclopedia", Volume 2], Bradbury, Evans & Co. (1867)</ref>

[[Hans Christian Ørsted]]'s discovery of the magnetic effect of electrical currents in 1820 was immediately recognized as an epoch-making advance, although he left further work on [[electromagnetism]] to others. [[André-Marie Ampère]] quickly repeated Ørsted's experiment, and formulated them mathematically.<ref>William Berkson [http://books.google.com/books?id=hMc9AAAAIAAJ&pg=PA34 Fields of force: the development of a world view from Faraday to Einstein], Routledge (1974) ISBN 0-7100-7626-6 pp. 34 ff</ref>

In 1821, Estonian-German [[physicist]] [[Thomas Johann Seebeck]] demonstrated the electrical potential in the juncture points of two dissimilar metals when there is a [[heat]] difference between the joints.<ref name=ohm>Brian Scott Baigrie [http://books.google.com/books?id=3XEc5xkWxi4C&pg=PA73 Electricity and magnetism: a historical perspective], Greenwood Publishing Group (2007) ISBN 0-313-33358-0 p. 73</ref>

In 1827, the German scientist [[Georg Ohm]] expressed his [[Ohm's law|law]] in this famous book ''"Die galvanische Kette, mathematisch bearbeitet"'' (The Galvanic Circuit Investigated Mathematically) in which he gave his complete theory of electricity.<ref name=ohm/>

In 1832, [[Michael Faraday]]'s experiments led him to state his two laws of electrochemistry. In 1836, [[John Frederic Daniell|John Daniell]] invented a primary cell in which [[hydrogen]] was eliminated in the generation of the electricity. Daniell had solved the problem of polarization. Later results revealed that [[alloy]]ing the [[Amalgam (chemistry)|amalgam]]ated [[zinc]] with [[Mercury (element)|mercury]] would produce a better voltage.
[[File:Arrhenius2.jpg|thumb|left|upright|Swedish chemist [[Svante Arrhenius]] portrait circa 1880s.]]
[[William Robert Grove|William Grove]] produced the first [[fuel cell]] in 1839. In 1846, [[Wilhelm Eduard Weber|Wilhelm Weber]] developed the [[electrodynamometer]]. In 1868, [[Georges Leclanché]] patented a new cell which eventually became the forerunner to the world's first widely used battery, the [[Zinc-carbon battery|zinc carbon cell]].<ref name=lai>Keith James Laidler [http://books.google.com/books?id=01LRlPbH80cC&pg=PA219 The world of physical chemistry], Oxford University Press (1995) ISBN 0-19-855919-4 pp. 219–220</ref>

[[Svante Arrhenius]] published his thesis in 1884 on ''Recherches sur la conductibilité galvanique des électrolytes'' (Investigations on the galvanic conductivity of electrolytes). From his results the author concluded that [[electrolyte]]s, when dissolved in water, become to varying degrees split or dissociated into electrically opposite positive and negative ions.<ref>Nobel Lectures, p. 59</ref>

In 1886, [[Paul Héroult]] and [[Charles Martin Hall|Charles M. Hall]] developed an efficient method (the [[Hall–Héroult process]]) to obtain [[aluminium]] using electrolysis of molten alumina.<ref>{{cite book|url = http://books.google.com/?id=td0jD4it63cC&pg=PT29|pages = 15–16|isbn = 978-0-7506-6371-7|chapter = Production of Aluminium|author = Polmear, I.J. |year = 2006|publisher = Elsevier/Butterworth-Heinemann|location = Oxford|title = Light alloys from traditional alloys to nanocrystals}}</ref>

In 1894, [[Wilhelm Ostwald|Friedrich Ostwald]] concluded important studies of the [[Conductivity (electrolytic)|conductivity]] and electrolytic dissociation of [[organic acid]]s.<ref>Nobel Lectures, p. 170</ref>
[[File:Walther Nernst 2.jpg|thumb|right|upright|German scientist [[Walther Nernst]] portrait in the 1910s.]]
[[Walther Nernst|Walther Hermann Nernst]] developed the theory of the [[electromotive force]] of the voltaic cell in 1888. In 1889, he showed how the characteristics of the current produced could be used to calculate the [[Thermodynamic free energy|free energy]] change in the chemical reaction producing the current. He constructed an equation, known as [[Nernst equation]], which related the voltage of a cell to its properties.<ref>Nobel Lectures, p. 363</ref>

In 1898, [[Fritz Haber]] showed that definite reduction products can result from electrolytic processes if the potential at the [[cathode]] is kept constant. In 1898, he explained the reduction of [[nitrobenzene]] in stages at the cathode and this became the model for other similar reduction processes.<ref>Nobel Lectures, p. 342</ref>

===20th century and recent developments===
In 1902, [[The Electrochemical Society]] (ECS) was founded.<ref>[http://www.electrochem.org/dl/hc/ ECS History Center]</ref>

In 1909, [[Robert Andrews Millikan]] began a series of experiments to determine the electric charge carried by a single [[electron]].<ref>
{{cite journal
|last=Millikan |first=Robert A.
|year=1911
|title=The Isolation of an Ion, a Precision Measurement of its Charge, and the Correction of Stokes' Law
|journal=Physical Review
|volume=32 |issue=2 |pages=349–397
|doi=10.1103/PhysRevSeriesI.32.349
|bibcode=1911PhRvI..32..349M
}}</ref>

In 1923, [[Johannes Nicolaus Brønsted]] and [[Martin Lowry]] published essentially the same theory about how acids and bases behave, using an electrochemical basis.<ref>William L. Masterton, Cecile N. Hurley [http://books.google.com/books?id=teubNK-b2bsC&pg=PT379 Chemistry: Principles and Reactions], Cengage Learning (2008) ISBN 0-495-12671-3 p. 379</ref>

[[Arne Tiselius]] developed the first sophisticated [[electrophoretic]] apparatus in 1937 and some years later he was awarded the 1948 [[Nobel Prize]] for his work in protein [[electrophoresis]].<ref>[http://nobelprize.org/nobel_prizes/chemistry/laureates/1948/tiselius-bio.html The Nobel Prize in Chemistry 1948 Arne Tiselius], nobelprize.org</ref>

A year later, in 1949, the [[International Society of Electrochemistry]] (ISE) was founded.<ref>[http://www.ise-online.org/geninfo/index.php The International Society of Electrochemistry]</ref>

By the 1960s–1970s [[quantum electrochemistry]] was developed by [[Revaz Dogonadze]] and his pupils.

==Principles==
===Redox reactions===
{{Main|Redox reaction}}
[[Redox]] stands for '''reduction-oxidation''', and are electrochemical processes involving [[electron]] transfer to or from a [[molecule]] or [[ion]] changing its [[oxidation state]]. This reaction can occur through the application of an external [[voltage]] or through the release of chemical energy.

===Oxidation and reduction===

Oxidation and reduction describe the change of oxidation state that takes place in the atoms, ions or molecules involved in an electrochemical [[chemical reaction|reaction]]. Formally, oxidation state is the hypothetical [[Electric charge|charge]] that an atom would have if all bonds to atoms of different elements were 100% [[Ionic bond|ionic]]. An atom or ion that gives up an electron to another atom or ion has its oxidation state increase, and the recipient of the negatively charged electron has its oxidation state decrease. Oxidation and reduction always occur in a paired fashion such that one species is oxidized when another is reduced. This paired electron transfer is called a [[redox]] reaction.

For example, when atomic [[sodium]] reacts with atomic [[chlorine]], sodium donates one electron and attains an oxidation state of +1. Chlorine accepts the electron and its oxidation state is reduced to −1. The sign of the oxidation state (positive/negative) actually corresponds to the value of each ion's electronic charge. The attraction of the differently charged sodium and chlorine ions is the reason they then form an [[ionic bond]].

The loss of electrons from an atom or molecule is called [[oxidation]], and the gain of electrons is [[redox|reduction]]. This can be easily remembered through the use of [[mnemonic]] devices. Two of the most popular are ''"OIL RIG"'' (Oxidation Is Loss, Reduction Is Gain) and ''"LEO"'' the lion says ''"GER"'' (Lose Electrons: Oxidization, Gain Electrons: Reduction). For cases where electrons are shared (covalent bonds) between atoms with large differences in [[electronegativity]], the electron is assigned to the atom with the largest electronegativity in determining the oxidation state.

The atom or molecule which loses electrons is known as the ''reducing agent'', or ''reductant'', and the substance which accepts the electrons is called the ''oxidizing agent'', or ''oxidant''. The oxidizing agent is always being reduced in a reaction; the reducing agent is always being oxidized. Oxygen is a common oxidizing agent, but not the only one. Despite the name, an oxidation reaction does not necessarily need to involve oxygen. In fact, a [[fire]] can be fed by an oxidant other than oxygen; [[fluorine]] fires are often unquenchable, as fluorine is an even stronger oxidant (it has a higher [[electronegativity]]) than oxygen.

For reactions involving oxygen, the gain of oxygen implies the oxidation of the atom or molecule to which the oxygen is added (and the oxygen is reduced). In organic compounds, such as [[butane]] or [[ethanol]], the loss of hydrogen implies oxidation of the molecule from which it is lost (and the hydrogen is reduced). This follows because the hydrogen donates its electron in covalent bonds with non-metals but it takes the electron along when it is lost. Conversely, loss of oxygen or gain of hydrogen implies reduction.

===Balancing redox reactions===
{{Main|Chemical equation}}
Electrochemical reactions in water are better understood by balancing redox reactions using the [[ion-electron method]] where [[hydronium|H+]], [[Hydroxide|OH–]] ion, [[Water (molecule)|H2O]] and electrons (to compensate the oxidation changes) are added to cell's [[half-reaction]]s for oxidation and reduction.

====Acidic medium====
In acid medium [[hydronium|H+]] ions and water are added to [[half-reaction]]s to balance the overall reaction.
For example, when [[manganese]] reacts with [[sodium bismuthate]].
:''Unbalanced reaction'': Mn2+(aq) + NaBiO3(s) → Bi3+(aq) + MnO4–(aq)
:''Oxidation'': 4 H2O(l) + Mn2+(aq) → MnO4–(aq) + 8 H+(aq) + 5 e–
:''Reduction'': 2 e– + 6 H+(aq) + BiO3–(s) → Bi3+(aq) + 3 H2O(l)

Finally, the reaction is balanced by [[multiplication|multiplying]] the number of electrons from the reduction half reaction to oxidation half reaction and vice versa and adding both half reactions, thus solving the equation.
:8 H2O(l) + 2 Mn2+(aq) → 2 MnO4–(aq) + 16 H+(aq) + 10 e–
:10 e– + 30 H+(aq) + 5 BiO3–(s) → 5 Bi3+(aq) + 15 H2O(l)
Reaction balanced:
:14 H+(aq) + 2 Mn2+(aq) + 5 NaBiO3(s) → 7 H2O(l) + 2 MnO4–(aq) + 5 Bi3+(aq) + 5 Na+(aq)

====Basic medium====
In basic medium [[Hydroxide|OH–]] ions and [[Water (molecule)|water]] are added to half reactions to balance the overall reaction. For example, on reaction between [[potassium permanganate]] and [[sodium sulfite]].
:''Unbalanced reaction'': KMnO4 + Na2SO3 + H2O → MnO2 + Na2SO4 + KOH
:''Reduction'': 3 e– + 2 H2O + MnO4– → MnO2 + 4 OH–
:''Oxidation'': 2 OH– + SO32– → SO42– + H2O + 2 e–

The same procedure as followed on acid medium by multiplying electrons to opposite half reactions solve the equation thus balancing the overall reaction.
:6 e– + 4 H2O + 2 MnO4– → 2 MnO2 + 8 OH–
:6 OH– + 3 SO32– → 3 SO42– + 3 H2O + 6e–
Equation balanced:
:2 KMnO4 + 3 Na2SO3 + H2O → 2 MnO2 + 3 Na2SO4 + 2 KOH

====Neutral medium====
The same procedure as used on acid medium is applied, for example on balancing using electron ion method to [[Combustion|complete combustion]] of [[propane]].
:''Unbalanced reaction'': C3H8 + O2 → CO2 + H2O
:''Reduction'': 4 H+ + O2 + 4 e– → 2 H2O
:''Oxidation'': 6 H2O + C3H8 → 3 CO2 + 20 e– + 20 H+

As in acid and basic medium, electrons which were used to compensate oxidation changes are multiplied to opposite half reactions, thus solving the equation.
:20 H+ + 5 O2 + 20 e– → 10 H2O
:6 H2O + C3H8 → 3 CO2 + 20 e– + 20 H+
Equation balanced:
:C3H8 + 5 O2 → 3 CO2 + 4 H2O

==Electrochemical cells==
{{Main|Electrochemical cell}}

An electrochemical cell is a device that produces an electric current from energy released by a [[Spontaneous process|spontaneous]] redox reaction. This kind of cell includes the [[Galvanic cell]] or Voltaic cell, named after [[Luigi Galvani]] and Alessandro Volta, both scientists who conducted several experiments on chemical reactions and electric current during the late 18th century.

Electrochemical cells have two conductive electrodes (the anode and the cathode). The [[anode]] is defined as the electrode where oxidation occurs and the [[cathode]] is the electrode where the reduction takes place. Electrodes can be made from any sufficiently conductive materials, such as metals, semiconductors, graphite, and even [[conductive polymer]]s. In between these electrodes is the [[electrolyte]], which contains ions that can freely move.

The galvanic cell uses two different metal electrodes, each in an electrolyte where the positively charged ions are the oxidized form of the electrode metal. One electrode will undergo oxidation (the anode) and the other will undergo reduction (the cathode). The metal of the anode will oxidize, going from an oxidation state of 0 (in the solid form) to a positive oxidation state and become an ion. At the cathode, the metal ion in solution will accept one or more electrons from the cathode and the ion's oxidation state is reduced to 0. This forms a solid metal that [[electroplating|electrodeposits]] on the cathode. The two electrodes must be electrically connected to each other, allowing for a flow of electrons that leave the metal of the anode and flow through this connection to the ions at the surface of the cathode. This flow of electrons is an electrical current that can be used to do work, such as turn a motor or power a light.

A galvanic cell whose [[electrode]]s are [[zinc]] and [[copper]] submerged in [[zinc sulfate]] and [[copper sulfate]], respectively, is known as a [[Daniell cell]].<ref name=w215/>

Half reactions for a Daniell cell are these:<ref name=w215/>
:Zinc electrode (anode): Zn(s) → Zn2+(aq) + 2 e–
:Copper electrode (cathode): Cu2+(aq) + 2 e– → Cu(s)
[[File:BASi epsilon C3 cell stand.jpg|thumb|right|A modern cell stand for electrochemical research. The electrodes attach to high-quality metallic wires, and the stand is attached to a [[potentiostat]]/[[galvanostat]] (not pictured). A [[shot glass]]-shaped container is [[Aerated water|aerated]] with a noble gas and sealed with the [[Polytetrafluoroethylene|Teflon]] block.]]

In this example, the anode is zinc metal which oxidizes (loses electrons) to form zinc ions in solution, and copper ions accept electrons from the copper metal electrode and the ions deposit at the copper cathode as an electrodeposit. This cell forms a simple battery as it will spontaneously generate a flow of electrical current from the anode to the cathode through the external connection. This reaction can be driven in reverse by applying a voltage, resulting in the deposition of zinc metal at the anode and formation of copper ions at the cathode.<ref name=w215/>

To provide a complete electric circuit, there must also be an ionic conduction path between the anode and cathode electrolytes in addition to the electron conduction path. The simplest ionic conduction path is to provide a liquid junction. To avoid mixing between the two electrolytes, the liquid junction can be provided through a porous plug that allows ion flow while reducing electrolyte mixing. To further minimize mixing of the electrolytes, a [[salt bridge]] can be used which consists of an electrolyte saturated gel in an inverted U-tube. As the negatively charged electrons flow in one direction around this circuit, the positively charged metal ions flow in the opposite direction in the electrolyte.

A [[galvanometer|voltmeter]] is capable of measuring the change of [[Electric potential|electrical potential]] between the anode and the cathode.

Electrochemical cell voltage is also referred to as [[electromotive force]] or emf.

A cell diagram can be used to trace the path of the electrons in the electrochemical cell. For example, here is a cell diagram of a Daniell cell:
:Zn(s) | Zn2+ (1M) || Cu2+ (1M) | Cu(s)

First, the reduced form of the metal to be oxidized at the anode (Zn) is written. This is separated from its oxidized form by a vertical line, which represents the limit between the phases (oxidation changes). The double vertical lines represent the saline bridge on the cell. Finally, the oxidized form of the metal to be reduced at the cathode, is written, separated from its reduced form by the vertical line. The electrolyte concentration is given as it is an important variable in determining the cell potential.

==Standard electrode potential==
{{Main|Standard electrode potential}}

To allow prediction of the cell potential, tabulations of [[standard electrode potential]] are available. Such tabulations are referenced to the standard hydrogen electrode (SHE). The [[standard hydrogen electrode]] undergoes the reaction
:2 H+(aq) + 2 e– → H2
which is shown as reduction but, in fact, the SHE can act as either the anode or the cathode, depending on the relative oxidation/reduction potential of the other electrode/electrolyte combination. The term standard in SHE requires a supply of hydrogen gas bubbled through the electrolyte at a pressure of 1 atm and an acidic electrolyte with H+ activity equal to 1 (usually assumed to be [H+] = 1 mol/liter).

The SHE electrode can be connected to any other electrode by a salt bridge to form a cell. If the second electrode is also at standard conditions, then the measured cell potential is called the standard electrode potential for the electrode. The standard electrode potential for the SHE is zero, by definition. The polarity of the standard electrode potential provides information about the relative reduction potential of the electrode compared to the SHE. If the electrode has a positive potential with respect to the SHE, then that means it is a strongly reducing electrode which forces the SHE to be the anode (an example is Cu in aqueous CuSO4 with a standard electrode potential of 0.337 V). Conversely, if the measured potential is negative, the electrode is more oxidizing than the SHE (such as Zn in ZnSO4 where the standard electrode potential is −0.76 V).<ref name=w215>Wiberg, pp. 215–216</ref>

Standard electrode potentials are usually tabulated as reduction potentials. However, the reactions are reversible and the role of a particular electrode in a cell depends on the relative oxidation/reduction potential of both electrodes. The oxidation potential for a particular electrode is just the negative of the reduction potential. A standard cell potential can be determined by looking up the standard electrode potentials for both electrodes (sometimes called half cell potentials). The one that is smaller will be the anode and will undergo oxidation. The cell potential is then calculated as the sum of the reduction potential for the cathode and the oxidation potential for the anode.

:E°cell = E°red(cathode) – E°red(anode) = E°red(cathode) + E°oxi(anode)

For example, the standard electrode potential for a copper electrode is:

:''Cell diagram''
:Pt(s) | H2(1 atm) | H+(1 M) || Cu2+ (1 M) | Cu(s)
:E°cell = E°red(cathode) – E°red(anode)

At standard temperature, pressure and concentration conditions, the cell's [[electromotive force|emf]] (measured by a [[multimeter]]) is 0.34 V. By definition, the electrode potential for the SHE is zero. Thus, the Cu is the cathode and the SHE is the anode giving
:Ecell = E°(Cu2+/Cu) – E°(H+/H2)
Or,
:E°(Cu2+/Cu) = 0.34 V

Changes in the [[stoichiometric coefficient]]s of a balanced cell equation will not change E°red value because the standard electrode potential is an [[Intensive and extensive properties|intensive property]].

==Spontaneity of redox reaction==
{{Main|Spontaneous process}}

During operation of [[electrochemical cell]]s, [[chemical energy]] is transformed into [[electrical energy]] and is expressed mathematically as the product of the cell's emf and the [[electric charge]] transferred through the external circuit.
:Electrical energy = EcellCtrans
where Ecell is the cell potential measured in volts (V) and Ctrans is the cell current integrated over time and measured in coulombs (C); Ctrans can also be determined by multiplying the total number of electrons transferred (measured in moles) times [[Faraday's constant]] (F).

The emf of the cell at zero current is the maximum possible emf. It is used to calculate the maximum possible electrical energy that could be obtained from a [[chemical reaction]]. This energy is referred to as [[electrical work]] and is expressed by the following equation:

:Wmax = Welectrical = –nF·Ecell,
where work is defined as positive into the system.

Since the [[Thermodynamic free energy|free energy]] is the maximum amount of work that can be extracted from a system, one can write:<ref name=s308>Swaddle, pp. 308–314</ref>
:ΔG = –nF·Ecell

A positive cell potential gives a negative change in Gibbs free energy. This is consistent with the cell production of an [[electric current]] from the cathode to the anode through the external circuit. If the current is driven in the opposite direction by imposing an external potential, then work is done on the cell to drive electrolysis.<ref name=s308/>

A [[Spontaneous process|spontaneous]] electrochemical reaction (change in Gibbs free energy less than zero) can be used to generate an electric current in electrochemical cells. This is the basis of all batteries and [[fuel cell]]s. For example, gaseous oxygen (O2) and
hydrogen (H2) can be combined in a fuel cell to form water and energy, typically a combination of heat and electrical energy.<ref name=s308/>

Conversely, non-spontaneous electrochemical reactions can be driven forward by the application of a current at sufficient [[voltage]]. The [[electrolysis]] of water into gaseous oxygen and hydrogen is a typical example.

The relation between the [[equilibrium constant]], ''K'', and the Gibbs free energy for an electrochemical cell is expressed as follows:

:ΔG° = –RT ln(K) = –nF·E°cell

Rearranging to express the relation between standard potential and equilibrium constant yields

:<math>\mbox{E}^{o}_{cell}={\mbox{RT} \over \mbox{nF}} \mbox{ln K}\,</math>.
The previous equation can use [[Briggsian logarithm]] as shown below:
:<math>\mbox{E}^{o}_{cell}={0.0591 \mbox{V} \over \mbox{n}} \mbox{log K}\,</math>

==Cell emf dependency on changes in concentration==
===Nernst equation===
{{Main|Nernst equation}}

The standard potential of an electrochemical cell requires standard conditions for all of the reactants. When reactant concentrations differ from standard conditions, the cell potential will deviate from the standard potential. In the 20th century German [[chemist]] [[Walther Nernst]] proposed a mathematical model to determine the effect of reactant concentration on electrochemical cell potential.

In the late 19th century, [[Josiah Willard Gibbs]] had formulated a theory to predict whether a chemical reaction is spontaneous based on the free energy

:ΔG = ΔG° + RT·ln(Q)

Here ''ΔG'' is change in [[Gibbs free energy]], ''T'' is absolute [[temperature]], ''R'' is the [[gas constant]] and ''Q'' is [[reaction quotient]].

Gibbs' key contribution was to formalize the understanding of the effect of reactant concentration on spontaneity.

Based on Gibbs' work, Nernst extended the theory to include the contribution from electric potential on charged species. As shown in the previous section, the change in Gibbs free energy for an electrochemical cell can be related to the cell potential. Thus, Gibbs' theory becomes

:nFΔE = nFΔE° – RT ln(Q)

Here ''n'' is the number of [[electron]]s/[[Mole (unit)|mole]] product, ''F'' is the [[Faraday constant]] ([[coulomb]]s/[[Mole (unit)|mole]]), and ''ΔE'' is [[cell potential]].

Finally, Nernst divided through by the amount of charge transferred to arrive at a new equation which now bears his name:
:ΔE = ΔE° – (RT/nF)ln(Q)

Assuming standard conditions (T = 25 °C) and [[Universal gas constant|R]] = 8.3145 J/(K·mol), the equation above can be expressed on [[Common logarithm|base—10 logarithm]] as shown below:<ref name=w210>Wiberg, pp. 210–212</ref>
:<math>\Delta E=\Delta E^{o}- {\mbox{0.05916 V} \over \mbox{n}} \mbox{log Q}\,</math>

===Concentration cells===
{{Main|Concentration cell}}
A concentration cell is an electrochemical cell where the two electrodes are the same material, the electrolytes on the two half-cells involve the same ions, but the electrolyte concentration differs between the two half-cells.

For example an electrochemical cell, where two copper electrodes are submerged in two [[copper(II) sulfate]] solutions, whose concentrations are 0.05 [[Molar concentration|M]] and 2.0 [[Molar concentration|M]], connected through a salt bridge. This type of cell will generate a potential that can be predicted by the Nernst equation. Both electrodes undergo the same chemistry (although the reaction proceeds in reverse at the cathode)

:Cu2+(aq) + 2 e– → Cu(s)

[[Le Chatelier's principle]] indicates that the reaction is more favorable to reduction as the concentration of Cu2+ ions increases. Reduction will take place in the cell's compartment where concentration is higher and oxidation will occur on the more dilute side.

The following cell diagram describes the cell mentioned above:
:Cu(s) | Cu2+ (0.05 M) || Cu2+ (2.0 M) | Cu(s)
Where the half cell reactions for oxidation and reduction are:
:''Oxidation'': Cu(s) → Cu2+ (0.05 M) + 2 e–
:''Reduction'': Cu2+ (2.0 M) + 2 e– → Cu(s)
:Overall reaction: Cu2+ (2.0 M) → Cu2+ (0.05 M)

The cell's emf is calculated through Nernst equation as follows:

:<math>E = E^{o}- {0.05916 V \over 2} log {[Cu^{2+}]_{diluted}\over [Cu^{2+}]_{concentrated}}\,</math>

The value of E° in this kind of cell is zero, as electrodes and ions are the same in both half-cells.

After replacing values from the case mentioned, it is possible to calculate cell's potential:
:<math>E = 0- {0.05916 V \over 2} log {0.05\over 2.0}= 0.0474{ } V\,</math>

or by:
:<math>E = 0- {0.0257 V \over 2} ln {0.05\over 2.0}= 0.0474{ } V\,</math>

However, this value is only approximate, as reaction quotient is defined in terms of ion activities which can be approximated with the concentrations as calculated here.

The Nernst equation plays an important role in understanding electrical effects in cells and organelles. Such effects include nerve [[synapses]] and [[cardiac cycle|cardiac beat]] as well as the resting potential of a somatic cell.

==Battery==
{{Main|Battery (electricity)}}

Many types of battery have been commercialized and represent an important practical application of electrochemistry. Early [[wet cell]]s powered the first [[Electrical telegraph|telegraph]] and [[telephone]] systems, and were the source of current for [[electroplating]]. The zinc-manganese dioxide [[dry cell]] was the first portable, non-spillable battery type that made [[flashlight]]s and other portable devices practical. The [[mercury battery]] using zinc and mercuric oxide provided higher levels of power and capacity than the original dry cell for early electronic devices, but has been phased out of common use due to the danger of mercury pollution from discarded cells.

The [[lead acid]] battery was the first practical secondary (rechargeable) battery that could have its capacity replenished from an external source. The electrochemical reaction that produced current was (to a useful degree) reversible, allowing electrical energy and chemical energy to be interchanged as needed. Lead-acid cells continue to be widely used in automobiles.

All the preceding types have water-based electrolytes, which limits the maximum voltage per cell. The freezing of water limits low temperature performance. The [[lithium battery]], which does not (and cannot) use water in the electrolyte, provides improved performance over other types; a rechargeable [[lithium ion battery]] is an essential part of many mobile devices.

The [[flow battery]], an experimental type, offers the option of vastly larger energy capacity because its reactants can be replenished from external reservoirs. The [[fuel cell]] can turn the chemical energy bound in hydrocarbon gases or hydrogen directly into electrical energy with much higher efficiency than any combustion process; such devices have powered many spacecraft and are being applied to [[grid energy storage]] for the public power system.

==Corrosion==
{{Main|Corrosion}}

Corrosion is the term applied to [[steel]] [[rust]] caused by an electrochemical process. Most people are likely familiar with the corrosion of [[iron]], in the form of reddish rust. Other examples include the black tarnish on [[silver]], and red or green corrosion that may appear on [[copper]] and its alloys, such as [[brass]]. The cost of replacing metals lost to corrosion is in the multi-billions of [[United States dollar|dollars]] per year.

===Iron corrosion===
For iron rust to occur the metal has to be in contact with [[oxygen]] and [[water]], although [[chemical reaction]]s for this process are relatively complex and not all of them are completely understood, it is believed the causes are the following:
Electron transferring (reduction-oxidation)
:One area on the surface of the metal acts as the anode, which is where the oxidation (corrosion) occurs. At the anode, the metal gives up electrons.
::Fe(s) → Fe2+(aq) + 2 e–
:[[Electron]]s are transferred from [[iron]] reducing oxygen in the [[atmosphere]] into [[water (molecule)|water]] on the cathode, which is placed in another region of the metal.
::O2(g) + 4 H+(aq) + 4 e– → 2 H2O(l)
:Global reaction for the process:
::2 Fe(s) + O2(g) + 4 H+(aq) → 2 Fe2+(aq) + 2 H2O(l)
:Standard emf for iron rusting:
::E° = E°cathode – E°anode
::E° = 1.23V – (−0.44 V) = 1.67 V
Iron corrosion takes place on acid medium; [[hydronium|H+]] [[ion]]s come from reaction between [[carbon dioxide]] in the atmosphere and water, forming [[carbonic acid]]. Fe2+ ions oxides, following this equation:
:4 Fe2+(aq) + O2(g) + (4+2x)H2O(l) → 2 Fe2O3·xH2O + 8 H+(aq)
[[Iron(III) oxide]] [[hydrated]] is known as rust. The concentration of water associated with iron oxide varies, thus chemical representation is presented as Fe2O3·xH2O.
The [[electric circuit]] works as passage of electrons and ions occurs, thus if an electrolyte is present it will facilitate [[oxidation]], this explains why rusting is quicker on [[brine|salt water]].

===Corrosion of common metals===
[[Coinage metal]]s, such as copper and silver, slowly corrode through use.
A [[patina]] of green-blue [[copper carbonate]] forms on the surface of [[copper]] with exposure to the water and carbon dioxide in the air. [[Silver]] coins or [[cutlery]] that are exposed to high sulfur foods such as [[Egg (food)|egg]]s or the low levels of sulfur species in the air develop a layer of black [[Silver sulfide]].

[[Gold]] and [[platinum]] are extremely difficult to oxidize under normal circumstances, and require exposure to a powerful chemical oxidizing agent such as [[aqua regia]].

Some common metals oxidize extremely rapidly in air. [[Titanium]] and aluminium oxidize instantaneously in contact with the oxygen in the air. These metals form an extremely thin layer of oxidized metal on the surface. This thin layer of oxide protects the underlying layers of the metal from the air preventing the entire metal from oxidizing. These metals are used in applications where corrosion resistance is important. [[Iron]], in contrast, has an oxide that forms in air and water, called [[rust]], that does not stop the further oxidation of the iron. Thus iron left exposed to air and water will continue to rust until all of the iron is oxided.

===Prevention of corrosion===
Attempts to save a metal from becoming anodic are of two general types. Anodic regions dissolve and destroy the structural integrity of the metal.

While it is almost impossible to prevent anode/[[cathode]] formation, if a [[Insulator (electrical)|non-conducting]] material covers the metal, contact with the [[electrolyte]] is not possible and corrosion will not occur.

====Coating====
Metals can be coated with [[paint]] or other less conductive metals (''[[Passivation (chemistry)|passivation]]''). This prevents the metal surface from being exposed to [[electrolyte]]s. Scratches exposing the metal substrate will result in corrosion. The region under the coating adjacent to the scratch acts as the [[anode]] of the reaction.

====Sacrificial anodes====
{{Main|Sacrificial anode}}
A method commonly used to protect a structural metal is to attach a metal which is more anodic than the metal to be protected. This forces the structural metal to be [[cathodic]], thus spared corrosion. It is called ''"sacrificial"'' because the anode dissolves and has to be replaced periodically.

[[Zinc]] bars are attached to various locations on steel [[ship]] [[Hull (watercraft)|hulls]] to render the ship hull [[cathode|cathodic]]. The zinc bars are replaced periodically. Other metals, such as [[magnesium]], would work very well but zinc is the least expensive useful metal.

To protect pipelines, an ingot of buried or exposed magnesium (or zinc) is [[bury|buried]] beside the [[Pipe (material)|pipeline]] and is [[wire|connected electrically]] to the pipe above ground. The pipeline is forced to be a cathode and is protected from being oxidized and rusting. The magnesium anode is sacrificed. At intervals new [[ingot]]s are buried to replace those lost.

==Electrolysis==
{{Main|Electrolysis}}

The spontaneous redox reactions of a conventional battery produce electricity through the different chemical potentials of the cathode and anode in the electrolyte. However, electrolysis requires an external source of [[electrical energy]] to induce a chemical reaction, and this process takes place in a compartment called an [[electrolytic cell]].

===Electrolysis of molten sodium chloride===

When molten, the salt [[sodium chloride]] can be electrolyzed to yield metallic [[sodium]] and gaseous [[chlorine]]. Industrially this process takes place in a special cell named Down's cell. The cell is connected to an electrical power supply, allowing [[electron]]s to migrate from the power supply to the electrolytic cell.<ref name=e800>Ebbing, pp. 800–801</ref>

Reactions that take place at Down's cell are the following:<ref name=e800/>
:Anode (oxidation): 2 Cl– → Cl2(g) + 2 e–
:Cathode (reduction): 2 Na+(l) + 2 e– → 2 Na(l)
:Overall reaction: 2 Na+ + 2 Cl–(l) → 2 Na(l) + Cl2(g)

This process can yield large amounts of metallic sodium and gaseous chlorine, and is widely used on [[mineral dressing]] and [[metallurgy]] [[industry|industries]].

The [[Electromotive force|emf]] for this process is approximately −4 [[Volt|V]] indicating a (very) non-spontaneous process. In order for this reaction to occur the power supply should provide at least a potential of 4 V. However, larger voltages must be used for this reaction to occur at a high rate.

===Electrolysis of water===
{{Main|Electrolysis of water}}
Water can be converted to its component elemental gasses, H2 and O2 through the application of an external voltage. [[Water]] doesn't decompose into [[hydrogen]] and [[oxygen]] [[Spontaneous process|spontaneously]] as the [[Gibbs free energy]] for the process at standard conditions is about 474.4 kJ. The decomposition of water into hydrogen and oxygen can be performed in an electrolytic cell. In it, a pair of inert [[electrode]]s usually made of [[platinum]] immersed in water act as anode and cathode in the electrolytic process. The electrolysis starts with the application of an external voltage between the electrodes. This process will not occur except at extremely high voltages without an electrolyte such as [[sodium chloride]] or [[sulfuric acid]] (most used 0.1 [[Molar concentration|M]]).<ref name=w235/>

Bubbles from the gases will be seen near both electrodes. The following half reactions describe the process mentioned above:

:''Anode (oxidation)'': 2 H2O(l) → O2(g) + 4 H+(aq) + 4 e–
:''Cathode (reduction)'': 2 H2O(g) + 2 e– → H2(g) + 2 OH–(aq)
:''Overall reaction'': 2 H2O(l) → 2 H2(g) + O2(g)

Although strong acids may be used in the apparatus, the reaction will not net consume the acid. While this reaction will work at any conductive electrode at a sufficiently large potential, platinum [[catalysis|catalyzes]] both hydrogen and oxygen formation, allowing for relatively mild voltages (~2 V depending on the [[pH]]).<ref name=w235>Wiberg, pp. 235–239</ref>

===Electrolysis of aqueous solutions===
Electrolysis in an aqueous is a similar process as mentioned in electrolysis of water. However, it is considered to be a complex process because the contents in solution have to be analyzed in [[chemical reaction|half reactions]], whether reduced or oxidized.

====Electrolysis of a solution of sodium chloride====
The presence of water in a solution of [[sodium chloride]] must be examined in respect to its reduction and oxidation in both electrodes. Usually, water is electrolysed as mentioned in electrolysis of water yielding ''gaseous [[oxygen]] in the anode'' and gaseous [[hydrogen]] in the cathode. On the other hand, sodium chloride in water [[Dissociation (chemistry)|dissociates]] in Na+ and Cl– ions, [[cation]], which is the positive ion, will be attracted to the cathode (+), thus reducing the [[sodium]] ion. The [[anion]] will then be attracted to the anode (–) oxidizing [[chloride]] ion.<ref name=nacl>Ebbing, pp. 837–839</ref>

The following half reactions describes the process mentioned:<ref name=nacl/>
:1. Cathode: Na+(aq) + e– → Na(s)     E°red = –2.71 V
:2. Anode: 2 Cl–(aq) → Cl2(g) + 2 e–     E°red = +1.36 V
:3. Cathode: 2 H2O(l) + 2 e– → H2(g) + 2 OH–(aq)    E°red = –0.83 V
:4. Anode: 2 H2O(l) → O2(g) + 4 H+(aq) + 4 e–    E°red = +1.23 V

Reaction 1 is discarded as it has the most [[Negative number|negative]] value on standard reduction potential thus making it less thermodynamically favorable in the process.

When comparing the reduction potentials in reactions 2 and 4, the reduction of chloride ion is favored. Thus, if the Cl– ion is favored for [[redox|reduction]], then the water reaction is favored for [[oxidation]] producing gaseous oxygen, however experiments show gaseous chlorine is produced and not oxygen.

Although the initial analysis is correct, there is another effect that can happen, known as the [[Overvoltage|overvoltage effect]]. Additional voltage is sometimes required, beyond the voltage predicted by the E°cell. This may be due to [[chemical kinetics|kinetic]] rather than [[Thermochemistry|thermodynamic]] considerations. In fact, it has been proven that the [[activation energy]] for the chloride ion is very low, hence favorable in [[chemical kinetics|kinetic terms]]. In other words, although the voltage applied is thermodynamically sufficient to drive electrolysis, the rate is so slow that to make the process proceed in a reasonable time frame, the [[voltage]] of the external source has to be increased (hence, overvoltage).<ref name=nacl/>

Finally, reaction 3 is favorable because it describes the proliferation of [[hydroxide|OH–]] ions thus letting a probable reduction of [[hydronium|H+]] ions less favorable an option.

The overall reaction for the process according to the analysis would be the following:<ref name=nacl/>
:Anode (oxidation): 2 Cl–(aq) → Cl2(g) + 2 e–
:Cathode (reduction): 2 H2O(l) + 2 e– → H2(g) + 2 OH–(aq)
:Overall reaction: 2 H2O + 2 Cl–(aq) → H2(g) + Cl2(g) + 2 OH–(aq)

As the overall reaction indicates, the [[concentration]] of chloride ions is reduced in comparison to OH– ions (whose concentration increases). The reaction also shows the production of gaseous [[hydrogen]], [[chlorine]] and aqueous [[sodium hydroxide]].

===Quantitative electrolysis and Faraday's laws===
{{Main|Faraday's law of electrolysis}}
Quantitative aspects of electrolysis were originally developed by [[Michael Faraday]] in 1834. Faraday is also credited to have coined the terms ''[[electrolyte]]'', electrolysis, among many others while he studied quantitative analysis of electrochemical reactions. Also he was an advocate of the [[law of conservation of energy]].

====First law====
Faraday concluded after several experiments on [[electrical current]] in [[spontaneous process|non-spontaneous process]], the [[mass]] of the products yielded on the electrodes was proportional to the value of current supplied to the cell, the length of time the current existed, and the molar mass of the substance analyzed. In other words, the amount of a substance deposited on each electrode of an electrolytic cell is directly proportional to the [[Electric charge|quantity of electricity]] passed through the cell.<ref>Wiberg, p. 65</ref>

Below is a simplified equation of Faraday's first law:
:<math>m \ = \ { 1 \over 96485 \ \mathrm{(C \cdot mol^{-1})} } \cdot { Q M \over n } </math>
Where
:''m'' is the mass of the substance produced at the electrode (in [[gram]]s),
:''Q'' is the total electric charge that passed through the solution (in [[coulomb]]s),
:''n'' is the valence number of the substance as an ion in solution (electrons per ion),
:''M'' is the molar mass of the substance (in grams per [[mole (unit)|mole]]).

====Second law====
{{Main|Electroplating}}
Faraday devised the laws of chemical electrodeposition of metals from solutions in 1857. He formulated the second law of electrolysis stating ''"the amounts of bodies which are equivalent to each other in their ordinary chemical action have equal quantities of electricity naturally associated with them."'' In other words, the quantities of different elements deposited by a given amount of electricity are in the [[ratio]] of their chemical [[equivalent weight]]s.<ref>[http://scienceworld.wolfram.com/biography/Faraday.html Faraday, Michael (1791–1867)], Wolfram Research</ref>

An important aspect of the second law of electrolysis is [[electroplating]] which together with the first law of electrolysis, has a significant number of applications in the industry, as when used to protect [[metal]]s to avoid [[corrosion]].

==Applications==

There are various extremely important electrochemical processes in both nature and industry, like the coating of objects with metals or metal oxides through electrodeposition and the detection of alcohol in drunken drivers through the redox reaction of ethanol. The generation of chemical energy through [[photosynthesis]] is inherently an electrochemical process, as is production of metals like aluminum and titanium from their ores. Certain diabetes blood sugar meters measure the amount of glucose in the blood through its redox potential.

The [[action potentials]] that travel down [[neurons]] are based on electric current generated by the movement of sodium and potassium ions into and out of cells. Specialized cells in certain animals like eels can generate electric currents powerful enough to disable much larger animals.

==See also==
{{Portal|Science}}
{{colbegin|3}}
*[[Reactivity series]]
*[[Bioelectromagnetism]]
*[[Bioelectrochemistry]]
*[[Contact tension]] – a historical forerunner to the theory of electrochemistry.
*[[Electrochemical impedance spectroscopy]]
*[[Electroanalytical method]]
*[[Electrochemical potential]]
*[[Electrochemiluminescence]]
*[[Electroplating]]
*[[Electrochemical engineering]]
*[[Electrochemical energy conversion]]
*[[Frost diagram]]
*[[List of important publications in chemistry#Electrochemistry|Important publications in electrochemistry]]
*[[Magnetoelectrochemistry]]
*[[Nanoelectrochemistry]]
*[[Photoelectrochemistry]]
*[[Pourbaix diagram]]
*[[Redox titration]]
*[[Standard electrode potential (data page)]]
*[[Voltammetry]]
*[[ITIES]]
{{colend}}

==References==
{{reflist|2}}

==Bibliography==
*Ebbing, Darrell D. and Gammon, Steven D. [http://books.google.com/books?id=_vRm5tiUJcsC&pg=PA837 General Chemistry] (2007) ISBN 0-618-73879-7,
*[http://books.google.com/books?id=NaVq4ztgsD8C&pg=PA59 Nobel Lectures in Chemistry], Volume 1, World Scientific (1999) ISBN 981-02-3405-8
*Swaddle, Thomas Wilson [http://books.google.com/books?id=hXpOtkYS5X4C&pg=PA316 Inorganic chemistry: an industrial and environmental perspective], Academic Press (1997) ISBN 0-12-678550-3
*Wiberg, Egon; Wiberg, Nils and Holleman, Arnold Frederick [http://books.google.com/books?id=Mtth5g59dEIC&pg=PA65 Inorganic chemistry], Academic Press (2001) ISBN 0-12-352651-5

==External links==
*{{Dmoz|Science/Chemistry/Electrochemistry|Electrochemistry}}
*[http://www.chem1.com/acad/webtext/elchem/ ''All about electrochemistry''] (online Reference Text for General Chemistry)
*[http://www.electrochem.org The Electrochemical Society]
*[http://electrochem.cwru.edu/estir/ Electrochemical Science and Technology Information Resource (ESTIR) ]
*[http://www.ise-online.org International Society of Electrochemistry (ISE)]

{{BranchesofChemistry}}
{{Analytical chemistry}}

[[Category:Electrochemistry| ]]
[[Category:Physical chemistry]]

{{Link FA|sl}}

[[ar:كيمياء كهربية]]
[[az:Elektrokimya]]
[[be:Электрахімія]]
[[be-x-old:Электрахімія]]
[[bg:Електрохимия]]
[[bs:Elektrohemija]]
[[ca:Electroquímica]]
[[cs:Elektrochemie]]
[[de:Elektrochemie]]
[[et:Elektrokeemia]]
[[el:Ηλεκτροχημεία]]
[[es:Electroquímica]]
[[eo:Elektrokemio]]
[[fa:الکتروشیمی]]
[[fr:Électrochimie]]
[[gl:Electroquímica]]
[[ko:전기화학]]
[[hi:विद्युत्-रसायन]]
[[hr:Elektrokemija]]
[[id:Elektrokimia]]
[[it:Elettrochimica]]
[[he:אלקטרוכימיה]]
[[kk:Электрохимия]]
[[la:Electrochemia]]
[[lb:Elektrochimie]]
[[lmo:Eletruchimica]]
[[hu:Elektrokémia]]
[[nl:Elektrochemie]]
[[ja:電気化学]]
[[no:Elektrokjemi]]
[[pl:Elektrochemia]]
[[pt:Eletroquímica]]
[[ro:Electrochimie]]
[[ru:Электрохимия]]
[[sq:Elektrokimia]]
[[scn:Alittrochìmica]]
[[simple:Electrochemistry]]
[[sl:Elektrokemija]]
[[sr:Elektrohemija]]
[[sh:Elektrokemija]]
[[su:Éléktrokimia]]
[[fi:Sähkökemia]]
[[sv:Elektrokemi]]
[[ta:மின்வேதியியல்]]
[[th:ไฟฟ้าเคมี]]
[[tr:Elektrokimya]]
[[uk:Електрохімія]]
[[vi:Điện hóa]]
[[zh:电化学]]

Outline of category theory

2012-07-23T14:06:03Z

Magmalex: /* External links */

The following outline is provided as an overview of and guide to category theory:

'''[[Category theory]]''' – area of study in [[mathematics]] that examines in an [[abstraction|abstract]] way the properties of particular mathematical concepts, by formalising them as collections of ''objects'' and ''arrows'' (also called [[morphism]]s, although this term also has a specific, non category-theoretical sense), where these collections satisfy certain basic conditions. Many significant areas of mathematics can be formalised as categories, and the use of category theory allows many intricate and subtle mathematical results in these fields to be stated, and proved, in a much simpler way than without the use of categories.

== Essence of category theory ==
{{Main|Category theory}}
* [[Category (mathematics)|Category]] –
* [[Functor]] –
* [[Natural transformation]] –

== Branches of category theory ==
* [[Homological algebra]] –
* [[Diagram chasing]] –
* [[Topos|Topos theory]] –
* [[Enriched category]] theory –

== Specific categories ==
*[[Category of sets]] –
:*[[Concrete category]] –
*[[Category of vector spaces]] –
:*[[Category of graded vector spaces]] –
*[[Category of chain complexes]] –
*[[Category of finite dimensional Hilbert spaces]] –
*[[Category of sets and relations]] –
*[[Category of topological spaces]] –
*[[Category of metric spaces]] –
*[[Category of preordered sets]] –
*[[Category of groups]] –
*[[Category of abelian groups]] –
*[[Category of rings]] –
*[[Category of magmas]] –
*[[Category of medial magmas]] –

==Objects==
*[[Initial object]] –
*[[Terminal object]] –
*[[Zero object]] –
*[[Subobject]] –
*[[Group object]] –
*[[Magma object]] –
*[[Natural number object]] –
*[[Exponential object]] –

==Morphisms==
{{main|morphism}}
*[[Epimorphism]] –
*[[Monomorphism]] –
*[[Zero morphism]] –
*[[Normal morphism]] –
*[[Dual (category theory)]] –
*[[Groupoid]] –
*[[Image (category theory)]] –
*[[Coimage]] –
*[[Commutative diagram]] –
*[[Cartesian morphism]] –
*[[Slice category]] –

==Functors==
{{main|Functor}}
*[[Isomorphism of categories]] –
*[[Natural transformation]] –
*[[Equivalence of categories]] –
*[[Subcategory]] –
*[[Faithful functor]] –
*[[Full functor]] –
*[[Forgetful functor]] –
*[[Yoneda lemma]] –
*[[Representable functor]] –
*[[Functor category]] –
*[[Adjoint functors]] –
:*[[Galois connection]] –
:*[[Pontryagin duality]] –
:*[[Affine scheme]] –
*[[Monad (category theory)]] –
*[[Comonad]] –
*[[Combinatorial species]] –
*[[Exact functor]] –
*[[Derived functor]] –
*[[Enriched functor]] –
*[[Kan extension|Kan extension of a functor]] –
*[[Hom functor]] –

==Limits==
{{main|Limit (category theory)}}
:*[[Product (category theory)]] –
:*[[Equaliser (mathematics)]] –
:*[[Kernel (category theory)]] –
:*[[Pullback (category theory)]]/[[fiber product]] –
:*[[Inverse limit]] –
::*[[Pro-finite group]] –
*[[Colimit]] –
:*[[Coproduct]] –
:*[[Coequalizer]] –
:*[[Cokernel]] –
:*[[Pushout (category theory)]] –
:*[[Direct limit]] –
*[[Biproduct]] –
:*[[Direct sum]] –

==Additive structure==

*[[Preadditive category]] –
*[[Additive category]] –
*[[Pre-Abelian category]] –
*[[Abelian category]] –
:*[[Exact sequence]] –
:*[[Exact functor]] –
:*[[Snake lemma]] –
::*[[Nine lemma]] –
:*[[Five lemma]] –
::*[[Short five lemma]] –
:*[[Mitchell's embedding theorem]] –
*[[Injective cogenerator]] –
*[[Derived category]] –
*[[Triangulated category]] –
*[[Model category]] –
*[[2-category]] –
*[[Bicategory]] –

==Dagger categories==
{{main|Dagger category}}
*[[Dagger symmetric monoidal category]] –
*[[Dagger compact category]] –
*[[Strongly ribbon category]] –

==Monoidal categories==
{{main|monoidal category}}
*[[Closed monoidal category]] –
*[[Braided monoidal category]] –

==Cartesian closed category==

{{Empty section|date=July 2010}}
==Structure==
{{main|Structure (category theory)}}
*[[Semigroupoid]] –
*[[Comma category]] –
*[[Localization of a category]] –
*[[Enriched category]] –
*[[Bicategory]] –

==Topoi, toposes==
{{main|Topos}}
* [[Sheaf (mathematics)|Sheaf]] –
* [[Gluing axiom]] –
* [[Descent (category theory)]] –
* [[Grothendieck topology]] –
* [[Introduction to topos theory]] –
* [[Subobject classifier]] –
* [[Pointless topology]] –
* [[Heyting algebra]] –

== History of category theory ==
: ''Main article: [[Category_theory#Historical_notes|History of category theory]]''

== Persons influential in the field of category theory ==
=== Category theory scholars ===
== See also ==
{{portal|Category theory}}
*[[Abstract nonsense]] –
*[[Homological algebra]] –
*[[Glossary of category theory]] –

== References ==
{{reflist|2}}

== External links ==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [[André Joyal]], [http://ncatlab.org/nlab CatLab], a wiki project dedicated to the exposition of categorical mathematics
* Chris Hillman, [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.24.3264&rep=rep1&type=pdf A Categorical Primer], formal introduction to category theory.
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
* [[Stanford Encyclopedia of Philosophy]]: "[http://plato.stanford.edu/entries/category-theory/ Category Theory]" -- by Jean-Pierre Marquis. Extensive bibliography.
* [http://www.mta.ca/~cat-dist/ List of academic conferences on category theory]
* Baez, John, 1996,"[http://math.ucr.edu/home/baez/week73.html The Tale of ''n''-categories.]" An informal introduction to higher order categories.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.
{{sisterlinks|Category theory}}

{{Outline footer}}

[[Category:Outlines|Category theory]]
[[Category:Mathematics-related lists|Category theory]]
[[Category:Category theory| ]]

Natural transformation

2012-07-23T13:59:16Z

Magmalex: Added external links

{{About|natural transformations in category theory|the natural competence of bacteria to take up foreign DNA|Transformation (genetics)}}
{{other uses|Transformation (mathematics) (disambiguation)}}
In [[category theory]], a branch of [[mathematics]], a '''natural transformation''' provides a way of transforming one [[functor]] into another while respecting the internal structure (i.e. the composition of [[morphism]]s) of the categories involved. Hence, a natural transformation can be considered to be a "morphism of functors". Indeed this intuition can be formalized to define so-called [[functor category|functor categories]]. Natural transformations are, after categories and functors, one of the most basic notions of [[category theory]] and consequently appear in the majority of its applications.

==Definition==
If ''F'' and ''G'' are [[functor]]s between the categories ''C'' and ''D'', then a '''natural transformation''' η from ''F'' to ''G'' associates to every object ''X'' in ''C'' a [[morphism]] {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} between objects of ''D'', called the '''component''' of η at ''X'', such that for every morphism {{nobreak|1=''f'' : ''X'' → ''Y'' in ''C''}} we have:

:<math>\eta_Y \circ F(f) = G(f) \circ \eta_X</math>

This equation can conveniently be expressed by the [[commutative diagram]]

[[File:Natural transformation.svg|175px]]

If both ''F'' and ''G'' are [[contravariant functor|contravariant]], the horizontal arrows in this diagram are reversed. If η is a natural transformation from ''F'' to ''G'', we also write {{nobreak|1=η : ''F'' → ''G''}} or {{nobreak|1=η : ''F'' ⇒ ''G''}}. This is also expressed by saying the family of morphisms {{nobreak|1=η''X'' : ''F''(''X'') → ''G''(''X'')}} is '''natural''' in ''X''.

If, for every object ''X'' in ''C'', the morphism η''X'' is an [[isomorphism]] in ''D'', then η is said to be a '''{{visible anchor|natural isomorphism}}''' (or sometimes '''natural equivalence''' or '''isomorphism of functors'''). Two functors ''F'' and ''G'' are called ''naturally isomorphic'' or simply ''isomorphic'' if there exists a natural isomorphism from ''F'' to ''G''.

An '''infranatural transformation''' η from ''F'' to ''G'' is simply a family of morphisms {{nobreak|1=η''X'': ''F''(''X'') → ''G''(''X'')}}. Thus a natural transformation is an infranatural transformation for which {{nobreak|1=η''Y'' ∘ ''F''(''f'') = ''G''(''f'') ∘ η''X''}} for every morphism {{nobreak|1=''f'' : ''X'' → ''Y''}}. The '''naturalizer''' of η, nat(η), is the largest [[subcategory]] of ''C'' containing all the objects of ''C'' on which η restricts to a natural transformation.

==Examples==
===Opposite group===
{{details|Opposite group}}
Statements such as
:"Every group is naturally isomorphic to its [[opposite group]]"
abound in modern mathematics. We will now give the precise meaning of this statement as well as its proof. Consider the category '''Grp''' of all [[group (mathematics)|group]]s with [[group homomorphism]]s as morphisms. If (''G'',*) is a group, we define its opposite group (''G''op,*op) as follows: ''G''op is the same set as ''G'', and the operation *op is defined by {{nobreak|1=''a'' *op ''b'' = ''b'' * ''a''}}. All multiplications in ''G''op are thus "turned around". Forming the [[Opposite category|opposite]] group becomes a (covariant!) functor from '''Grp''' to '''Grp''' if we define {{nobreak|1=''f''op = ''f''}} for any group homomorphism {{nobreak|1=''f'': ''G'' → ''H''}}. Note that ''f''op is indeed a group homomorphism from ''G''op to ''H''op:
:''f''op(''a'' *op ''b'') = ''f''(''b'' * ''a'') = ''f''(''b'') * ''f''(''a'') = ''f''op(''a'') *op ''f''op(''b'').
The content of the above statement is:
:"The identity functor {{nobreak|1=Id'''Grp''' : '''Grp''' → '''Grp'''}} is naturally isomorphic to the opposite functor {{nobreak|1=op : '''Grp''' → '''Grp'''}}."
To prove this, we need to provide isomorphisms {{nobreak|1=η''G'' : ''G'' → ''G''op}} for every group ''G'', such that the above diagram commutes. Set {{nobreak|1=η''G''(''a'') = ''a''−1}}. The formulas {{nobreak|1=(''ab'')−1 = ''b''−1 ''a''−1}} and {{nobreak|1=(''a''−1)−1 = ''a''}} show that η''G'' is a group homomorphism which is its own inverse. To prove the naturality, we start with a group homomorphism {{nobreak|1=''f'' : ''G'' → ''H''}} and show {{nobreak|1=η''H'' ∘ ''f'' = ''f''op ∘ η''G''}}, i.e. {{nobreak|1=(''f''(''a''))−1 = ''f''op(''a''−1)}} for all ''a'' in ''G''. This is true since {{nobreak|1=''f''op = ''f''}} and every group homomorphism has the property {{nobreak|1=(''f''(''a''))−1 = ''f''(''a''−1)}}.

===Double dual of a finite dimensional vector space===
If ''K'' is a [[field (mathematics)|field]], then for every [[vector space]] ''V'' over ''K'' we have a "natural" [[injective]] [[linear map]] {{nobreak|1=''V'' → ''V''**}} from the vector space into its [[double dual]]. These maps are "natural" in the following sense: the double dual operation is a functor, and the maps are the components of a natural transformation from the identity functor to the double dual functor.

===Counterexample: dual of a finite-dimensional vector space===
Every finite-dimensional vector space is isomorphic to its dual space, but this isomorphism relies on an arbitrary choice of isomorphism (for example, via choosing a basis and then taking the isomorphism sending this basis to the corresponding [[dual basis]]). There is in general no natural isomorphism between a finite-dimensional vector space and its dual space.<ref>{{harv|MacLane|Birkhoff|1999|loc=§VI.4}}</ref> However, related categories (with additional structure and restrictions on the maps) do have a natural isomorphism, as described below.

The dual space of a finite-dimensional vector space is again a finite-dimensional vector space of the same dimension, and these are thus isomorphic, since dimension is the only invariant of finite-dimensional vector spaces over a given field. However, in the absence of additional data (such as a basis), there is no given map from a space to its dual, and thus such an isomorphism requires a choice, and is "not natural". On the category of finite-dimensional vector spaces and linear maps, one can define an infranatural isomorphism from vector spaces to their dual by choosing an isomorphism for each space (say, by choosing a basis for every vector space and taking the corresponding isomorphism), but this will not define a natural transformation. Intuitively this is because it required a choice, rigorously because ''any'' such choice of isomorphisms will not commute with ''all'' linear maps; see {{harv|MacLane|Birkhoff|1999|loc=§VI.4}} for detailed discussion.

Starting from finite-dimensional vector spaces (as objects) and the dual functor, one can define a natural isomorphism, but this requires first adding additional structure, then restricting the maps from "all linear maps" to "linear maps that respect this structure". Explicitly, for each vector space, require that it come with the data of an isomorphism to its dual, <math>\eta_V\colon V \to V^*.</math> In other words, take as objects vector spaces with a [[nondegenerate bilinear form]] <math>b_V\colon V \times V \to K.</math> This defines an infranatural isomorphism (isomorphism for each object). One then restricts the maps to only those maps that commute with these isomorphism (restricts to the naturalizer of ''η''), in other words, restrict to the maps that do not change the bilinear form: <math>b(T(v),T(w))=b(v,w).</math> The resulting category, with objects finite-dimensional vector spaces with a nondegenerate bilinear form, and maps linear transforms that respect the bilinear form, by construction has a natural isomorphism from the identity to the dual (each space has an isomorphism to its dual, and the maps in the category are required to commute). Viewed in this light, this construction (add transforms for each object, restrict maps to commute with these) is completely general, and does not depend on any particular properties of vector spaces.

In this category (finite-dimensional vector spaces with a nondegenerate bilinear form, maps linear transforms that respect the bilinear form), the dual of a map between vector spaces can be identified as a [[transpose]]. Often for reasons of geometric interest this is specialized to a subcategory, by requiring that the nondegenerate bilinear forms have additional properties, such as being symmetric ([[orthogonal matrices]]), symmetric and positive definite ([[inner product space]]), symmetric sesquilinear ([[Hermitian space]]s), skew-symmetric and totally isotropic ([[symplectic vector space]]), etc. – in all these categories a vector space is naturally identified with its dual, by the nondegenerate bilinear form.

===Tensor-hom adjunction===
{{see|Tensor-hom adjunction|Adjoint functors}}
Consider the [[category of abelian groups|category '''Ab''' of abelian groups and group homomorphisms]]. For all abelian groups ''X'', ''Y'' and ''Z'' we have a group isomorphism
:{{nobreak|1=Hom(''X'' {{otimes}} ''Y'', ''Z'') → Hom(''X'', Hom(''Y'', ''Z''))}}.
These isomorphisms are "natural" in the sense that they define a natural transformation between the two involved functors {{nobreak|1='''Ab''' × '''Ab'''op × '''Ab'''op → '''Ab'''}}.

This is formally the [[tensor-hom adjunction]], and is an archetypal example of a pair of [[adjoint functors]]. Natural transformations arise frequently in conjunction with adjoint functors, and indeed, adjoint functors are defined by a certain natural isomorphism. Additionally, every pair of adjoint functors comes equipped with two natural transformations (generally not isomorphisms) called the ''unit'' and ''counit''.

== Unnatural isomorphism ==
The notion of a natural transformation is categorical, and states (informally) that a particular map between functors can be done consistently over an entire category. Informally, a particular map (esp. an isomorphism) between individual objects (not entire categories) is referred to as a "natural isomorphism", meaning implicitly that it is actually defined on the entire category, and defines a natural transformation of functors; formalizing this intuition was a motivating factor in the development of category theory. Conversely, a particular map between particular objects may be called an '''unnatural isomorphism''' (or "this isomorphism is not natural") if the map cannot be extended to a natural transformation on the entire category. Given an object ''X,'' a functor ''G'' (taking for simplicity the first functor to be the identity) and an isomorphism <math>\eta\colon X \to G(X),</math> proof of unnaturality is most easily shown by giving an automorphism <math>A\colon X \to X</math> that does not commute with this isomorphism (so <math>\eta \circ A \neq A \circ \eta</math>). More strongly, if one wishes to prove that ''X'' and ''G''(''X'') are not naturally isomorphic, without reference to a particular isomorphism, this requires showing that for ''any'' isomorphism ''η,'' there is some ''A'' with which it does not commute; in some cases a single automorphism ''A'' works for all candidate isomorphisms ''η,'' while in other cases one must show how to construct a different ''A''''η'' for each isomorphism. The maps of the category play a crucial role – any infranatural transform is natural if the only maps are the identity map, for instance.

This is similar (but more categorical) to concepts in group theory or module theory, where a given decomposition of an object into a direct sum is "not natural", or rather "not unique", as automorphisms exist that do not preserve the direct sum decomposition – see [[Structure theorem for finitely generated modules over a principal ideal domain#Uniqueness]] for example.

Some authors distinguish notationally, using ≅ for a natural isomorphism and ≈ for an unnatural isomorphism, reserving = for equality (usually equality of maps).

== Operations with natural transformations ==
If {{nobreak|1=η : ''F'' → ''G''}} and {{nobreak|1=ε : ''G'' → ''H''}} are natural transformations between functors {{nobreak|1=''F'',''G'',''H'' : ''C'' → ''D''}}, then we can compose them to get a natural transformation {{nobreak|1=εη : ''F'' → ''H''}}. This is done componentwise: {{nobreak|1=(εη)''X'' = ε''X''η''X''}}. This "vertical composition" of natural transformation is [[associative]] and has an identity, and allows one to consider the collection of all functors {{nobreak|1=''C'' → ''D''}} itself as a category (see below under [[#Functor categories|Functor categories]]).

Natural transformations also have a "horizontal composition". If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}} and {{nobreak|1=ε : ''J'' → ''K''}} is a natural transformation between functors {{nobreak|1=''J'',''K'' : ''D'' → ''E''}}, then the composition of functors allows a composition of natural transformations {{nobreak|1=ηε : ''JF'' → ''KG''}}. This operation is also associative with identity, and the identity coincides with that for vertical composition. The two operations are related by an identity which exchanges vertical composition with horizontal composition.

If {{nobreak|1=η : ''F'' → ''G''}} is a natural transformation between functors {{nobreak|1=''F'',''G'' : ''C'' → ''D''}}, and {{nobreak|1=''H'' : ''D'' → ''E''}} is another functor, then we can form the natural transformation {{nobreak|1=''H''η : ''HF'' → ''HG''}} by defining

:<math> (H \eta)_X = H \eta_X. </math>

If on the other hand {{nobreak|1=''K'' : ''B'' → ''C''}} is a functor, the natural transformation {{nobreak|1=η''K'' : ''FK'' → ''GK''}} is defined by

:<math> (\eta K)_X = \eta_{K(X)}.\, </math>

==Functor categories==

{{Main|Functor category}}
If ''C'' is any category and ''I'' is a [[small category]], we can form the [[functor category]] ''CI'' having as objects all functors from ''I'' to ''C'' and as morphisms the natural transformations between those functors. This forms a category since for any functor ''F'' there is an identity natural transformation {{nobreak|1=1''F'' : ''F'' → ''F''}} (which assigns to every object ''X'' the identity morphism on ''F''(''X'')) and the composition of two natural transformations (the "vertical composition" above) is again a natural transformation.

The [[isomorphism]]s in ''CI'' are precisely the natural isomorphisms. That is, a natural transformation {{nobreak|1=η : ''F'' → ''G''}} is a natural isomorphism if and only if there exists a natural transformation {{nobreak|1=ε : ''G'' → ''F''}} such that {{nobreak|1=ηε = 1''G''}} and {{nobreak|1=εη = 1''F''}}.

The functor category ''CI'' is especially useful if ''I'' arises from a [[directed graph]]. For instance, if ''I'' is the category of the directed graph {{nobreak|1=• → •}}, then ''CI'' has as objects the morphisms of ''C'', and a morphism between {{nobreak|1=φ : ''U'' → ''V''}} and {{nobreak|1=ψ : ''X'' → ''Y''}} in ''CI'' is a pair of morphisms {{nobreak|1=''f'' : ''U'' → ''X''}} and {{nobreak|1=''g'' : ''V'' → ''Y''}} in ''C'' such that the "square commutes", i.e. {{nobreak|1=ψ ''f'' = ''g'' φ}}.

More generally, one can build the [[2-category]] '''Cat''' whose
* 0-cells (objects) are the small categories,
* 1-cells (arrows) between two objects <math>C</math> and <math>D</math> are the functors from <math>C</math> to <math>D</math>,
* 2-cells between two 1-cells (functors) <math>F:C\to D</math> and <math>G:C\to D</math> are the natural transformations from <math>F</math> to <math>G</math>.
The horizontal and vertical compositions are the compositions between natural transformations described previously. A functor category <math>C^I</math> is then simply a hom-category in this category (smallness issues aside).

==Yoneda lemma==

{{Main|Yoneda lemma}}
If ''X'' is an object of a [[locally small category]] ''C'', then the assignment {{nobreak|1=''Y'' {{mapsto}} Hom''C''(''X'', ''Y'')}} defines a covariant functor {{nobreak|1=''F''''X'' : ''C'' → '''Set'''}}. This functor is called ''[[representable functor|representable]]'' (more generally, a representable functor is any functor naturally isomorphic to this functor for an appropriate choice of ''X''). The natural transformations from a representable functor to an arbitrary functor {{nobreak|1=''F'' : ''C'' → '''Set'''}} are completely known and easy to describe; this is the content of the [[Yoneda lemma]].

== Historical notes ==
{{Unreferenced section|date=October 2008}}
[[Saunders Mac Lane]], one of the founders of category theory, is said to have remarked, "I didn't invent categories to study functors; I invented them to study natural transformations." Just as the study of [[group (mathematics)|groups]] is not complete without a study of [[group homomorphism|homomorphisms]], so the study of categories is not complete without the study of [[functor]]s. The reason for Mac Lane's comment is that the study of functors is itself not complete without the study of natural transformations.

The context of Mac Lane's remark was the axiomatic theory of [[homology (mathematics)|homology]]. Different ways of constructing homology could be shown to coincide: for example in the case of a [[simplicial complex]] the groups defined directly would be isomorphic to those of the singular theory. What cannot easily be expressed without the language of natural transformations is how homology groups are compatible with morphisms between objects, and how two equivalent homology theories not only have the same homology groups, but also the same morphisms between those groups.

==Symbols used==
* {{unichar|2297|CIRCLED TIMES|html=}}

== See also ==
* [[Extranatural transformation]]

== References ==
{{Portal|Category theory}}
{{reflist}}
{{refbegin}}
*{{cite book | first = Saunders | last = Mac Lane | authorlink = Saunders Mac Lane | year = 1998 | title = [[Categories for the Working Mathematician]] | series = Graduate Texts in Mathematics '''5''' | edition = 2nd | publisher = Springer-Verlag | isbn = 0-387-98403-8}}
* {{citation|first1=Saunders|last1=MacLane|authorlink1=Saunders MacLane|first2=Garrett|last2=Birkhoff|authorlink2=Garrett Birkhoff|title=Algebra|edition=3rd|publisher=AMS Chelsea Publishing|year=1999|isbn=0-8218-1646-2}}.
{{refend}}

==External links==
* [http://ncatlab.org/nlab nLab], a wiki project on mathematics, physics and philosophy with emphasis on the ''n''-categorical point of view
* [[André Joyal]], [http://ncatlab.org/nlab CatLab], a wiki project dedicated to the exposition of categorical mathematics
* {{cite web | first = Chris | last = Hillman | title = A Categorical Primer | id = {{citeseerx|10.1.1.24.3264}} | postscript = : }} formal introduction to category theory.
* J. Adamek, H. Herrlich, G. Stecker, [http://katmat.math.uni-bremen.de/acc/acc.pdf Abstract and Concrete Categories-The Joy of Cats]
* [[Stanford Encyclopedia of Philosophy]]: "[http://plato.stanford.edu/entries/category-theory/ Category Theory]" -- by Jean-Pierre Marquis. Extensive bibliography.
* [http://www.mta.ca/~cat-dist/ List of academic conferences on category theory]
* Baez, John, 1996,"[http://math.ucr.edu/home/baez/week73.html The Tale of ''n''-categories.]" An informal introduction to higher order categories.
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*{{planetmath reference|id=5622|title=Category Theory}}
* [http://categorieslogicphysics.wikidot.com/events Video archive] of recorded talks relevant to categories, logic and the foundations of physics.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Functors]]

[[es:Transformación natural]]
[[fr:Transformation naturelle]]
[[it:Trasformazione naturale]]
[[nl:Natuurlijke transformatie]]
[[ja:自然変換]]
[[pl:Transformacja naturalna]]
[[ru:Естественное преобразование]]
[[sv:Naturlig transformation]]

Diagram (category theory)

2012-07-23T13:46:01Z

Magmalex: /* External Links */

In [[category theory]], a branch of mathematics, a '''diagram''' is the categorical analogue of an [[indexed family]] in [[set theory]]. The primary difference is that in the categorical setting one has [[morphism]]s that also need indexing. An indexed family of sets is a collection of sets, indexed by a fixed set; equivalently, a ''function'' from a fixed index ''set'' to the class of ''sets''. A diagram is a collection of objects and morphisms, indexed by a fixed category; equivalently, a ''functor'' from a fixed index ''category'' to some ''category''.

Diagrams are used in the definition of [[limit (category theory)|limit and colimits]] and the related notion of [[cone (category theory)|cone]]s.

==Definition==

Formally, a '''diagram''' of type ''J'' in a [[category (mathematics)|category]] ''C'' is a ([[Covariance and contravariance of functors|covariant]]) [[functor]]
:''D'' : ''J'' → ''C''
The category ''J'' is called the '''index category''' or the '''scheme''' of the diagram ''D''. The actual objects and morphisms in ''J'' are largely irrelevant, only the way in which they are interrelated matters. The diagram ''D'' is thought of as indexing a collection of objects and morphisms in ''C'' patterned on ''J''.

Although, technically, there is no difference between an individual ''diagram'' and a ''functor'' or between a ''scheme'' and a ''category'', the change in terminology reflects a change in perspective, just as in the set theoretic case: one fixes the index category, and allows the functor (and, secondarily, the target category) to vary.

One is most often interested in the case where the scheme ''J'' is a [[small category|small]] or even [[Finite set|finite]] category. A diagram is said to be '''small''' or '''finite''' whenever ''J'' is.

A morphism of diagrams of type ''J'' in a category ''C'' is a [[natural transformation]] between functors. One can then interpret the '''category of diagrams''' of type ''J'' in ''C'' as the [[functor category]] ''C''''J'', and a diagram is then an object in this category.

==Examples==

* If ''J'' is a (small) [[discrete category]], then a diagram of type ''J'' is essentially just an indexed family of objects in ''C'' (indexed by ''J'').

* If ''J'' is a [[poset category]] then a diagram of type ''J'' is a family of objects ''D''''i'' together with a unique morphism ''f''''ij'' : ''D''''i'' → ''D''''j'' whenever ''i'' ≤ ''j''. If ''J'' is [[directed set|directed]] then a diagram of type ''J'' is called a [[direct system (mathematics)|direct system]] of objects and morphisms. If the diagram is [[contravariant functor|contravariant]] then it is called an [[inverse system]].

* If <math>J = 0 \overrightarrow{\to} 1</math>, then a diagram of type ''J'' (<math>f,g\colon X \to Y</math>) is called "two parallel morphisms": its limit is an [[Equaliser (mathematics)|equalizer]], and its colimit is a [[coequalizer]].

* If ''J'' = -1 ← 0 → +1, then a diagram of type ''J'' (''A'' ← ''B'' → ''C'') is a [[span (category theory)|span]], and its colimit is a [[Pushout (category theory)|pushout]].

* If ''J'' = -1 → 0 ← +1, then a diagram of type ''J'' (''A'' → ''B'' ← ''C'') is a [[cospan]], and its limit is a [[Pullback (category theory)|pullback]].

==Cones and limits==

A [[cone (category theory)|cone]] with vertex ''N'' of a diagram ''D'' : ''J'' → ''C'' is a morphism from the constant diagram Δ(''N'') to ''D''. The constant diagram is the diagram which sends every object of ''J'' to an object ''N'' of ''C'' and every morphism to the identity morphism on ''N''.

The [[limit (category theory)|limit]] of a diagram ''D'' is a [[universal cone]] to ''D''. That is, a cone through which all other cones uniquely factor. If the limit exists in a category ''C'' for all diagrams of type ''J'' one obtains a functor
:lim : ''C''''J'' → ''C''
which sends each diagram to its limit.

Dually, the [[colimit]] of diagram ''D'' is a universal cone from ''D''. If the colimit exists for all diagrams of type ''J'' one has a functor
:colim : ''C''''J'' → ''C''
which sends each diagram to its colimit.

== Commutative diagrams ==
{{main|Commutative diagram}}

Diagrams and functor categories are often visualized by [[commutative diagrams]], particularly if the index category is a finite [[poset category]] with few elements: one draws a commutative diagram with a node for every object in the index category, and an arrow for a generating set of morphisms, omitting identity maps and morphisms that can be expressed as compositions. The commutativity corresponds to the uniqueness of a map between two objects in a poset category. Conversely, every commutative diagram represents a diagram (a functor from a poset index category) in this way.

Not every diagram commutes, as not every index category is a poset category:
most simply, the diagram of a single object with an endomorphism (<math>f\colon X \to X</math>), or with two parallel arrows (<math>\bullet \overrightarrow{\to} \bullet</math>; <math>f,g\colon X \to Y</math>) need not commute. Further, diagrams may be impossible (because infinite) or messy (because many objects or morphisms) to draw; however, schematic commutative diagrams (for subcategories of the index category, or with ellipses, such as for a directed system) are used to clarify such complex diagrams.

== See also ==

*[[Commutative diagram]]
*[[Functor category]]

=== Limits ===
*[[Colimit]]
*[[Cone (category theory)]]
*[[Limit (category theory)]]

=== Examples ===
* [[Indexed family]]
* [[Direct system (mathematics)|Direct system]]
* [[Inverse system]]
* [[Span (category theory)|Span]]
* [[Cospan]]

==References==

*{{cite book | last = Adámek | first = Jiří | coauthors = Horst Herrlich, and George E. Strecker | year = 1990 | url = http://katmat.math.uni-bremen.de/acc/acc.pdf | title = Abstract and Concrete Categories | publisher = John Wiley & Sons | isbn = 0-471-60922-6}} Now available as free on-line edition (4.2MB PDF).
* {{Cite book| last1=Barr| first1=Michael|authorlink1=Michael Barr (mathematician) | last2=Wells| first2=Charles| authorlink2=Charles Wells (mathematician) |year=2002| title=Toposes, Triples and Theories|url=http://www.tac.mta.ca/tac/reprints/articles/12/tr12.pdf|isbn=0-387-96115-1}} Revised and corrected free online version of ''Grundlehren der mathematischen Wissenschaften (278)'' Springer-Verlag, 1983).

==External Links==

* [http://mathworld.wolfram.com/DiagramChasing.html Diagram Chasing] at [[MathWorld]]
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s, [[universal properties]].
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Functors]]

[[ko:그림 (범주론)]]
[[nl:Diagram (categorietheorie)]]
[[pl:Diagram (teoria kategorii)]]

Diagram (category theory)

2012-07-23T13:44:36Z

Magmalex: /* Cones and limits */

In [[category theory]], a branch of mathematics, a '''diagram''' is the categorical analogue of an [[indexed family]] in [[set theory]]. The primary difference is that in the categorical setting one has [[morphism]]s that also need indexing. An indexed family of sets is a collection of sets, indexed by a fixed set; equivalently, a ''function'' from a fixed index ''set'' to the class of ''sets''. A diagram is a collection of objects and morphisms, indexed by a fixed category; equivalently, a ''functor'' from a fixed index ''category'' to some ''category''.

Diagrams are used in the definition of [[limit (category theory)|limit and colimits]] and the related notion of [[cone (category theory)|cone]]s.

==Definition==

Formally, a '''diagram''' of type ''J'' in a [[category (mathematics)|category]] ''C'' is a ([[Covariance and contravariance of functors|covariant]]) [[functor]]
:''D'' : ''J'' → ''C''
The category ''J'' is called the '''index category''' or the '''scheme''' of the diagram ''D''. The actual objects and morphisms in ''J'' are largely irrelevant, only the way in which they are interrelated matters. The diagram ''D'' is thought of as indexing a collection of objects and morphisms in ''C'' patterned on ''J''.

Although, technically, there is no difference between an individual ''diagram'' and a ''functor'' or between a ''scheme'' and a ''category'', the change in terminology reflects a change in perspective, just as in the set theoretic case: one fixes the index category, and allows the functor (and, secondarily, the target category) to vary.

One is most often interested in the case where the scheme ''J'' is a [[small category|small]] or even [[Finite set|finite]] category. A diagram is said to be '''small''' or '''finite''' whenever ''J'' is.

A morphism of diagrams of type ''J'' in a category ''C'' is a [[natural transformation]] between functors. One can then interpret the '''category of diagrams''' of type ''J'' in ''C'' as the [[functor category]] ''C''''J'', and a diagram is then an object in this category.

==Examples==

* If ''J'' is a (small) [[discrete category]], then a diagram of type ''J'' is essentially just an indexed family of objects in ''C'' (indexed by ''J'').

* If ''J'' is a [[poset category]] then a diagram of type ''J'' is a family of objects ''D''''i'' together with a unique morphism ''f''''ij'' : ''D''''i'' → ''D''''j'' whenever ''i'' ≤ ''j''. If ''J'' is [[directed set|directed]] then a diagram of type ''J'' is called a [[direct system (mathematics)|direct system]] of objects and morphisms. If the diagram is [[contravariant functor|contravariant]] then it is called an [[inverse system]].

* If <math>J = 0 \overrightarrow{\to} 1</math>, then a diagram of type ''J'' (<math>f,g\colon X \to Y</math>) is called "two parallel morphisms": its limit is an [[Equaliser (mathematics)|equalizer]], and its colimit is a [[coequalizer]].

* If ''J'' = -1 ← 0 → +1, then a diagram of type ''J'' (''A'' ← ''B'' → ''C'') is a [[span (category theory)|span]], and its colimit is a [[Pushout (category theory)|pushout]].

* If ''J'' = -1 → 0 ← +1, then a diagram of type ''J'' (''A'' → ''B'' ← ''C'') is a [[cospan]], and its limit is a [[Pullback (category theory)|pullback]].

==Cones and limits==

A [[cone (category theory)|cone]] with vertex ''N'' of a diagram ''D'' : ''J'' → ''C'' is a morphism from the constant diagram Δ(''N'') to ''D''. The constant diagram is the diagram which sends every object of ''J'' to an object ''N'' of ''C'' and every morphism to the identity morphism on ''N''.

The [[limit (category theory)|limit]] of a diagram ''D'' is a [[universal cone]] to ''D''. That is, a cone through which all other cones uniquely factor. If the limit exists in a category ''C'' for all diagrams of type ''J'' one obtains a functor
:lim : ''C''''J'' → ''C''
which sends each diagram to its limit.

Dually, the [[colimit]] of diagram ''D'' is a universal cone from ''D''. If the colimit exists for all diagrams of type ''J'' one has a functor
:colim : ''C''''J'' → ''C''
which sends each diagram to its colimit.

== Commutative diagrams ==
{{main|Commutative diagram}}

Diagrams and functor categories are often visualized by [[commutative diagrams]], particularly if the index category is a finite [[poset category]] with few elements: one draws a commutative diagram with a node for every object in the index category, and an arrow for a generating set of morphisms, omitting identity maps and morphisms that can be expressed as compositions. The commutativity corresponds to the uniqueness of a map between two objects in a poset category. Conversely, every commutative diagram represents a diagram (a functor from a poset index category) in this way.

Not every diagram commutes, as not every index category is a poset category:
most simply, the diagram of a single object with an endomorphism (<math>f\colon X \to X</math>), or with two parallel arrows (<math>\bullet \overrightarrow{\to} \bullet</math>; <math>f,g\colon X \to Y</math>) need not commute. Further, diagrams may be impossible (because infinite) or messy (because many objects or morphisms) to draw; however, schematic commutative diagrams (for subcategories of the index category, or with ellipses, such as for a directed system) are used to clarify such complex diagrams.

== See also ==

*[[Commutative diagram]]
*[[Functor category]]

=== Limits ===
*[[Colimit]]
*[[Cone (category theory)]]
*[[Limit (category theory)]]

=== Examples ===
* [[Indexed family]]
* [[Direct system (mathematics)|Direct system]]
* [[Inverse system]]
* [[Span (category theory)|Span]]
* [[Cospan]]

==References==

*{{cite book | last = Adámek | first = Jiří | coauthors = Horst Herrlich, and George E. Strecker | year = 1990 | url = http://katmat.math.uni-bremen.de/acc/acc.pdf | title = Abstract and Concrete Categories | publisher = John Wiley & Sons | isbn = 0-471-60922-6}} Now available as free on-line edition (4.2MB PDF).
* {{Cite book| last1=Barr| first1=Michael|authorlink1=Michael Barr (mathematician) | last2=Wells| first2=Charles| authorlink2=Charles Wells (mathematician) |year=2002| title=Toposes, Triples and Theories|url=http://www.tac.mta.ca/tac/reprints/articles/12/tr12.pdf|isbn=0-387-96115-1}} Revised and corrected free online version of ''Grundlehren der mathematischen Wissenschaften (278)'' Springer-Verlag, 1983).

==External Links==

* [http://mathworld.wolfram.com/DiagramChasing.html Diagram Chasing] at [[MathWorld]]
* [http://wildcatsformma.wordpress.com WildCats] is a category theory package for [[Mathematica]]. Manipulation and visualization of objects, [[morphism]]s, categories, [[functor]]s, [[natural transformation]]s.
* [http://www.youtube.com/user/TheCatsters The catsters], a YouTube channel about category theory.
*[http://www.j-paine.org/cgi-bin/webcats/webcats.php Interactive Web page] which generates examples of categorical constructions in the category of finite sets.

[[Category:Functors]]

[[ko:그림 (범주론)]]
[[nl:Diagram (categorietheorie)]]
[[pl:Diagram (teoria kategorii)]]