A formula provides a simple, brief representation of a compound. To ensure general acceptability and avoid ambiguity, certain norms have been recommended by the IUPAC.
The empirical formula of a compound expresses the simplest mole ratio of the elements present. The element symbols are cited in alphabetical order, each with an appropriate numerical subscript. Carbon-containing compounds are an exception to this norm: C and H are usually cited first and second, respectively, followed by the remaining elements in alphabetical order; for example, C2H5Br, C10H10ClFe.
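The ordering rule for empirical formulae can be illustrated with a short Python sketch. The function name `empirical_formula` and the dictionary input format are assumptions made for illustration only. Plain string sorting is used for the alphabetical order, which places a one-letter symbol before a two-letter symbol with the same initial.

```python
from functools import reduce
from math import gcd


def empirical_formula(counts):
    """Build an empirical formula from a mapping of element symbol -> atom count.

    Hypothetical helper: counts are reduced to the simplest whole-number
    ratio, then symbols are cited alphabetically, except that C and H
    come first and second when carbon is present.
    """
    divisor = reduce(gcd, counts.values())
    reduced = {symbol: n // divisor for symbol, n in counts.items()}

    symbols = sorted(reduced)  # plain alphabetical order
    if "C" in reduced:
        # Carbon-containing compound: C first, H second, rest alphabetical.
        rest = [s for s in symbols if s not in ("C", "H")]
        symbols = ["C"] + (["H"] if "H" in reduced else []) + rest

    # A subscript of 1 is not written.
    return "".join(s + (str(reduced[s]) if reduced[s] > 1 else "") for s in symbols)


print(empirical_formula({"Br": 1, "H": 5, "C": 2}))
print(empirical_formula({"Fe": 1, "Cl": 1, "H": 10, "C": 10}))
```

The two calls reproduce the examples quoted above, C2H5Br and C10H10ClFe. For a compound without carbon, hydrogen simply takes its alphabetical place.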
The molecular formula represents the actual composition of a compound consisting of discrete molecules and is more informative than the empirical formula. The order of citation of the elements in such formulae is mostly based on an electronegativity criterion.
However, for most species containing more than two atoms, formal treatment as a coordination compound is found convenient. The order of citation of central atoms is again based on electronegativity. Ligands or their abbreviations are arranged alphabetically. Where possible, the ligand formula is written so that the donor atom is closest to the symbol of the central atom.
PBrCl2, [BH4]–, [Al(OH)(OH2)5]2+
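The coordination-type layout used in these examples (central atom first, ligands in alphabetical order) can be sketched in Python. Everything here is an illustrative assumption: the name `coordination_formula`, the dictionary input, the use of simple string sorting for the alphabetical order of ligands, and the rule of thumb that a ligand with more than one capital letter or with a digit is polyatomic and needs parentheses. The charge is written with an ASCII hyphen for minus.

```python
def coordination_formula(central, ligands, charge=0):
    """Format a mononuclear coordination-type formula.

    Hypothetical helper. `ligands` maps a ligand formula (written with the
    donor atom first, e.g. "OH2") to the number of such ligands.
    """
    parts = [central]
    for ligand in sorted(ligands):  # ligands in alphabetical order
        n = ligands[ligand]
        # Heuristic (an assumption): several capitals or a digit => polyatomic.
        polyatomic = (sum(ch.isupper() for ch in ligand) > 1
                      or any(ch.isdigit() for ch in ligand))
        text = f"({ligand})" if polyatomic else ligand
        parts.append(text + (str(n) if n > 1 else ""))
    body = "".join(parts)

    if charge == 0:
        return body
    # Charged entities are enclosed in square brackets, charge outside.
    magnitude = "" if abs(charge) == 1 else str(abs(charge))
    sign = "+" if charge > 0 else "-"
    return f"[{body}]{magnitude}{sign}"


print(coordination_formula("P", {"Cl": 2, "Br": 1}))
print(coordination_formula("B", {"H": 4}, charge=-1))
print(coordination_formula("Al", {"OH2": 5, "OH": 1}, charge=2))
```

The three calls correspond to the three formulae listed above. The sketch does not attempt the electronegativity-based ordering of several central atoms.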
A group of atoms functioning as a single entity is not split but is written as a unit, e.g.,
POBr3 (not PBr3O), UO2Br2 (not UBr2O2)
In the parent hydride system, the remaining unsubstituted hydrogen atoms of the parent hydride are cited first, immediately after the central atom and ahead of the substituents, e.g.,
SiH2BrCl, GeH2F2, B2H5Cl
For carbaboranes, boron atoms are cited first; carbon atoms replacing skeletal boron atoms are placed next regardless of what other elements are present, e.g., B3C2H4Br.
Formulae of inorganic oxoacids are conventionally written according to the traditional order of citation: replaceable hydrogen atoms (i.e., those bound to O), central atom, non-replaceable hydrogen atoms, and finally oxygen. These acids may also be represented in the format of coordination compounds.
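The traditional citation order for oxoacids is easy to express as code. The following Python sketch is illustrative only; the function name `oxoacid_formula` and its parameters are assumptions, and the caller must already know how many hydrogen atoms are replaceable.

```python
def _term(symbol, n):
    """Return symbol plus subscript; nothing for n = 0, no subscript for n = 1."""
    if n == 0:
        return ""
    return symbol + (str(n) if n > 1 else "")


def oxoacid_formula(central, oxygen, replaceable_h=0, other_h=0, central_count=1):
    """Hypothetical helper: traditional oxoacid formula.

    Order of citation: replaceable H, central atom, non-replaceable H, O.
    """
    return (_term("H", replaceable_h)
            + _term(central, central_count)
            + _term("H", other_h)
            + _term("O", oxygen))


print(oxoacid_formula("S", 4, replaceable_h=2))             # sulfuric acid
print(oxoacid_formula("P", 3, replaceable_h=2, other_h=1))  # phosphonic acid
```

For phosphonic acid the sketch gives H2PHO3, which keeps the P-bound hydrogen after the central atom and so distinguishes it from the two replaceable hydrogen atoms.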
For chain compounds containing three or more different elements, the atom to atom connectivity of the elements is generally represented in the formula, though it may violate other ordering principles (alphabetical order or order based on electronegativity).
NCS– or SCN–; BrSCN; HOCN; HNCO
A compound may be treated as a generalized salt when its formula cannot be naturally assigned to any of the categories just described. All electropositive constituents precede all electronegative constituents (Table 2.3). Within each category, alphabetical order is followed –
MgCl(OH), Na[HPHO3], FeO(OH), Na(UO2)3[Zn(H2O)6](O2CCH3)9
Deviation from alphabetical order within the same category is allowed if it is intended to emphasize similarities between compounds –
CaTiO3 and ZnTiO3.
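The default generalized-salt ordering described above (before any deliberate deviation) can be sketched in Python as follows. The function name `generalized_salt` is hypothetical, and the classification of each constituent as electropositive or electronegative is assumed to be supplied by the caller. Constituents are passed as ready-formatted strings, and leading brackets are ignored when alphabetizing, which is a simplifying assumption.

```python
def generalized_salt(electropositive, electronegative):
    """Hypothetical helper: join constituents of a generalized salt.

    All electropositive constituents precede all electronegative ones;
    alphabetical order is followed within each category.
    """
    def key(constituent):
        # Ignore enclosing marks such as "(" or "[" when alphabetizing.
        return constituent.lstrip("([")

    return "".join(sorted(electropositive, key=key)
                   + sorted(electronegative, key=key))


print(generalized_salt(["Mg"], ["(OH)", "Cl"]))
print(generalized_salt(["Fe"], ["(OH)", "O"]))
```

The calls reproduce MgCl(OH) and FeO(OH). An intentional deviation, as in the titanate pair above, would have to be written by hand.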
Formulae of addition compounds and compounds which can be regarded as such (including clathrates and multiple salts) are written in order of increasing mole ratio of the components, with the mole ratio expressed by an Arabic numeral (omitted when it is 1) before each component. The components are separated by a dot (.). Components with equal numbers are arranged alphabetically. Water, when present, is conventionally cited last.
Al2(SO4)3.K2SO4.24H2O (not K2SO4.Al2(SO4)3.24H2O)
C6H6.NH3.Ni(CN)2
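The ordering of components in an addition compound amounts to a two-level sort, which the following Python sketch illustrates. The function name `addition_compound` and the dictionary input are assumptions. Alphabetical order is approximated by plain string comparison of the component formulae, and the components are joined with a dot and no spaces.

```python
def addition_compound(components, water="H2O"):
    """Hypothetical helper: format an addition compound.

    `components` maps each component formula to its mole ratio.
    Components are sorted by increasing ratio, ties are broken
    alphabetically, and water (if present) is always cited last.
    """
    # Tuples sort by ratio first, then by formula string.
    ordered = sorted((n, f) for f, n in components.items() if f != water)
    if water in components:
        ordered.append((components[water], water))

    # A ratio of 1 is not written.
    return ".".join((str(n) if n != 1 else "") + f for n, f in ordered)


print(addition_compound({"K2SO4": 1, "Al2(SO4)3": 1, "H2O": 24}))
print(addition_compound({"Ni(CN)2": 1, "NH3": 1, "C6H6": 1}))
```

Whatever order the components are supplied in, the first call places Al2(SO4)3 before K2SO4 and water last, and the second returns the three equimolar components in alphabetical order.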
For isotopically modified compounds, the modified nuclide(s) are indicated in the formula. Such compounds may be of two types: isotopically substituted compounds and isotopically labelled compounds.
An isotopically substituted compound has all the molecules substituted for the indicated nuclide(s) at each designated position. The mass numbers of the substituted nuclides are shown as left superscripts.
An isotopically labelled compound is essentially a mixture of an isotopically unmodified ("normal") compound with one or more analogous isotopically substituted compounds. When a unique isotopically substituted compound is added formally to the analogous isotopically unmodified compound, one gets a specifically labelled compound. This is indicated by enclosing the relevant nuclide symbol(s) (with multiplicative subscripts, if any) in square brackets.
A mixture of specifically labelled compounds is classed as a selectively labelled compound. This is shown in the formula by placing the symbol(s) of the relevant nuclide(s) as a prefix in square brackets. No multiplicative subscript is used.
Besides empirical and molecular formulae, a structural formula gives partial or complete information about the connectivity of atoms and their spatial arrangement. A line formula showing the sequence of atomic symbols is the simplest representation, but it is meaningful only to a reader who already knows the structure. For example –
A more complex structure may be represented in line formulae using appropriate enclosing marks ―
A formula or part of a formula which represents a molecular entity may be placed in enclosing marks. Except in the case of repeating polymeric units, square brackets are used to enclose the entire formula. Thus one may represent sodium chloride as ―
NaCl stands for the solid sodium chloride with composition NaCl. [NaCl] represents the molecular compound consisting of one atom of sodium and one atom of chlorine; the additive name ‘chloridosodium’ may be used specifically for the molecular compound when it is looked upon as a coordination entity.
Several modifiers may be applied to all types of formulae to convey specific information such as oxidation state (Roman numerals I, II, etc.), formal ionic charge, free radicals, optical rotation, structural descriptors (e.g., cis/trans) and excited states (in right superscript).