RCSB PDB Help

Chemical Similarity Search

Introduction

What is Chemical Similarity Search?

The Chemical Similarity Search option allows you to query the PDB archive using information about small molecules defined in the Chemical Component and BIRD chemical reference dictionaries, such as their molecular formula or chemical descriptors. The search results can be reported either as chemical components that match the query parameters or PDB structures that contain a molecule matching the query criteria (default).

Why run a Chemical Similarity Search?

When you have unique chemical information (e.g., a chemical formula or descriptor), you can use it to find chemical components (e.g., drugs, inhibitors, modified residues, or building blocks such as amino acids, nucleotides, or sugars) that:

  • are similar to the formula or descriptor used in the query (perhaps one or two atoms/groups are different)
  • contain the query as part of a larger molecule (i.e., the specified formula/descriptor is a substructure)
  • exactly or very closely match the formula or descriptor used in the query

The search can also be used to identify PDB structures that include the chemical component(s) which match or are similar to the query. These structures can then be examined to learn about the interactions of the component within the structure.

Documentation

Several options can be combined to run a Chemical Similarity Search. They are described here in three sections:

  • Query - the options for entering your query
  • Search - the types of searches that can be run, from exact matches to substructure matches
  • Results - the options for what is shown on the results page

Query Options

A Chemical Similarity Search can be initiated in three main ways: with a Chemical Formula, a descriptor for the molecule, or a 2D chemical drawing.

Chemical Formula

A chemical formula presents the chemical symbols of the elements in a molecule together with numbers representing their proportions. The order of element symbols in the formula is not important. For example, the input "O1 C12 N4 H28" will match a chemical component with formula "C12 H28 N4 O". Other symbols, such as parentheses and charge indicators, may also be included in chemical formulae. Note that a Chemical Formula Search is case-sensitive: the formula "NIC4" with an uppercase I is read as Nitrogen, Iodine, Carbon4, while "NiC4" with a lowercase i is read as Nickel, Carbon4.
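The order-independence and case-sensitivity described above can be illustrated with a minimal Python sketch. The `parse_formula` helper and its regular expression are assumptions made for illustration, not the parser used by the search service; the sketch ignores parentheses and charge indicators and does not check that a symbol is a real element.

```python
import re
from collections import Counter

# One uppercase letter, an optional lowercase letter, then an optional count.
ELEMENT_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")

def parse_formula(formula: str) -> Counter:
    """Return element counts for a simple formula such as 'C12 H28 N4 O'."""
    counts = Counter()
    for symbol, number in ELEMENT_TOKEN.findall(formula):
        # A missing count means a single atom of that element.
        counts[symbol] += int(number) if number else 1
    return counts

# Element order does not matter once the formula is reduced to counts.
print(parse_formula("O1 C12 N4 H28") == parse_formula("C12 H28 N4 O"))  # True

# Letter case changes which elements are read.
print(dict(parse_formula("NIC4")))  # {'N': 1, 'I': 1, 'C': 4}
print(dict(parse_formula("NiC4")))  # {'Ni': 1, 'C': 4}
```

Comparing element counts rather than the raw text is what makes "O1 C12 N4 H28" and "C12 H28 N4 O" equivalent in this sketch.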

Descriptors

Two main Chemical Descriptors can be used for a chemical component (ligand):

  • SMILES (Simplified Molecular Input Line Entry Specification) is a chemical notation that represents chemical structures in a form a computer can process. Beyond chemical element symbols, a SMILES string is a linear notation of molecular structure that includes information about bond orders, ring structures, and stereochemistry. Note that SMILES strings generated by different software may differ slightly. Chemical similarity searches attempt to include most of these representations, so SMILES-based searches may return more matches than those initiated using InChI descriptors.
  • InChI (International Chemical Identifier) is a standard textual identifier, developed by IUPAC (International Union of Pure and Applied Chemistry) and NIST (National Institute of Standards and Technology), that represents the chemical structure of molecules. This descriptor stores layers of information about the molecule's atoms, bond connectivity, stereochemistry, charge, etc.
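To make the idea of InChI layers concrete, the following Python sketch splits an InChI string at its "/" separators. The `split_inchi_layers` helper is a hypothetical name introduced here, and the sketch assumes a simple single-component InChI in which the second field is the formula; it is not a full InChI parser. The input is the InChI for chemical component EF2 from the Examples section.

```python
def split_inchi_layers(inchi: str) -> dict:
    """Split an InChI string into its version, formula, and prefixed layers."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string")
    parts = inchi[len("InChI="):].split("/")
    layers = {"version": parts[0]}
    if len(parts) > 1:
        # Assumption: the field after the version is the chemical formula.
        layers["formula"] = parts[1]
    for part in parts[2:]:
        if part:
            # Later layers start with a one-letter prefix (c, h, t, m, s, ...).
            layers[part[0]] = part[1:]
    return layers

EF2 = ("InChI=1S/C13H10N2O4/c16-10-6-5-9(11(17)14-10)15-12(18)"
       "7-3-1-2-4-8(7)13(15)19/h1-4,9H,5-6H2,(H,14,16,17)/t9-/m0/s1")

for name, value in split_inchi_layers(EF2).items():
    print(name, value)
```

In this example the "c" layer carries atom connectivity, the "h" layer carries hydrogen positions, and the "t", "m", and "s" layers carry stereochemistry.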

Chemical Drawing

A chemical drawing displays the atoms in a ligand, along with their connectivity, bond order, and chirality (as appropriate). You can use the Chemical Sketch tool to draw and edit the 2D structure of a ligand. The tool can automatically convert a chemical drawing into chemical descriptors (SMILES and InChI) and use them to find an exact match or a similar molecule in the PDB. Click here to learn more about this tool.

Note: In comparing Chemical Similarity Search results returned from SMILES and InChI descriptor queries, the standard InChI may provide greater specificity. Depending on whether you wish to see fewer or more matches in your results you can use the appropriate descriptor for the search.

Search Options

Using Chemical Formulae

Begin by setting the Query Type to Formula (Figure 1)

  • Type the query formula in the text box for Chemical Similarity Search.
  • Select the Match Subset options box as appropriate.
Figure 1: Chemical Similarity Search using Chemical Formula

By default, the search finds chemical components whose formula exactly matches the query. If the Match Subset option is selected, the query is treated as a formula subset that can match any portion of a chemical formula. This option is particularly useful when you are searching for chemical components that include a specific set of elements in a particular ratio. (See example).
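The difference between the default exact match and the Match Subset option can be sketched in Python as follows. This is one plausible reading of "match any portion of a chemical formula", assumed here for illustration: the query matches when every element in it occurs in the target at least as many times. The helper names `parse_formula` and `formula_matches` are hypothetical, and the actual service may apply different rules.

```python
import re
from collections import Counter

def parse_formula(formula: str) -> Counter:
    """Return element counts for a simple formula such as 'C12 H28 N4 O'."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts

def formula_matches(query: str, target: str, match_subset: bool = False) -> bool:
    """Compare two formulae exactly, or as a subset when match_subset is set."""
    q, t = parse_formula(query), parse_formula(target)
    if match_subset:
        # Assumed rule: each queried element is present in sufficient number.
        return all(t[element] >= count for element, count in q.items())
    return q == t

target = "C12 H28 N4 O"
print(formula_matches("C12 H28 N4 O", target))               # exact match
print(formula_matches("C6 H6", target))                      # not an exact match
print(formula_matches("C6 H6", target, match_subset=True))   # subset match
print(formula_matches("C6 Cl", target, match_subset=True))   # Cl is absent
```

With the subset rule, a small query formula can match many larger components, which is why the option is most useful for narrowing down by element content.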

Using Chemical Descriptors

Begin by setting the Query Type to Descriptor (Figure 2)

  • Type/paste the chemical descriptor in the text box for Chemical Search.
  • Select the appropriate Descriptor Type (SMILES or InChI).
Figure 2: Chemical Similarity Search using Descriptors

Note: The descriptor used in a chemical similarity search is converted into one of the following 2D representations for the search:

  • a fingerprint, which is an ordered set of binary digits (bits) that encode specific physicochemical and/or structural properties of the molecule, such as the presence of common functional groups or ring systems.
  • a graph, in which the atoms and bonds of a molecule are mapped onto nodes and edges respectively. Information about atom connectivity, bond order, etc. is also encoded and used to compare/match different chemical structures.
  • Select the Match Type from the options available (see Figure 3):
Figure 3: Available ligand search matching types
  • Similar Ligands (Quick Screen) - this option uses quick fingerprint matching. The Tanimoto coefficient is used to compute the degree of similarity between a pair of fingerprints. It ranges from 0 to 1, where higher values indicate greater structural similarity. Results of a Similar Ligands search include molecules with scores exceeding 0.6 for TREE type fingerprints or 0.9 for MACCS type fingerprints. Note that a Tanimoto coefficient of 1 means the two fingerprints are identical, which does not guarantee that the molecules themselves are a perfect match.
  • Similar Ligands (Stereospecific) - in this option the atom type, formal charge, bond order, as well as atom and bond chirality are used as matching criteria. Graph matching is performed on the subset of molecules that satisfy a fingerprint prefilter or screening search. Results will include isomorphic and substructure matches within this screened subset.
  • Similar Ligands (including Stereoisomers) - in this option the atom type, formal charge, and bond order are used as matching criteria. Graph matching is performed on the subset of molecules that satisfy a fingerprint prefilter or screening search. Results will include isomorphic and substructure matches within this screened subset.
  • Substructure (Stereospecific) - in this option graph matching searches perform an exhaustive substructure search where atom type, formal charge, bond order, aromaticity, and atom/bond stereochemistry are used as matching criteria for the search type. Results may include ligands much larger than the query including BIRD molecules where the query molecule is part of the structure.
  • Substructure (including Stereoisomers) - in this option graph matching searches perform an exhaustive substructure search where atom type, formal charge, bond order, and aromaticity are used as matching criteria for this search type. Results may include ligands much larger than the query including BIRD molecules where the query molecule is part of the structure.
  • Exact match - in this option the atom type, formal charge, aromaticity, bond order, atom/bond stereochemistry, degree, ring membership, and hydrogen count are used as matching criteria for this search type. Results will include chemical components where the query and target graphs match exactly or are very similar. In some cases (especially with SMILES based searches) stereoisomers may also be included in the results.
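The fingerprint comparison behind the Quick Screen option can be sketched in Python with the standard Tanimoto formula (shared "on" bits divided by the total number of distinct "on" bits). The bit positions and ligand names below are made up for illustration, and fingerprints are modelled as plain sets rather than real TREE or MACCS fingerprints; only the 0.6 cutoff is taken from the text above.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Hypothetical fingerprints: each set lists the positions of bits set to 1.
query = {1, 4, 7, 9, 12}
candidates = {
    "ligand_A": {1, 4, 7, 9, 12},
    "ligand_B": {1, 4, 7, 9, 15},
    "ligand_C": {2, 3, 12},
}

THRESHOLD = 0.6  # cutoff quoted above for TREE type fingerprints

hits = {}
for name, bits in candidates.items():
    score = tanimoto(query, bits)
    if score > THRESHOLD:
        hits[name] = score

print(hits)
```

Here ligand_A scores 1 because its bit set is identical to the query, yet two different molecules can set exactly the same bits, which is why a score of 1 is not proof of an exact structural match and why the graph-based Match Types exist.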

Result Options

Select one of the Display Results options before clicking on the query (green magnifying glass) icon.

  • Selecting the Structures option will list PDB entries that include the chemical components (ligands or monomers) that match the query
  • Selecting the Polymer Entity option will list polymer entities that have matching monomers in the deposited sequences
  • Selecting the Non-polymer Entity option will list non-polymeric small molecules matching the query
  • Selecting the Molecular Definitions option will list the chemical components/ligands that match the query

Examples

1. Using Chemical Formula

2. Using Chemical Descriptors

a. Find ligands that are similar to, or contain as a substructure, the chemical component VIB.

Chemical component VIB

The SMILES search with Cc1c(sc[n+]1Cc2cnc(nc2N)C)CCO and Match Type

Note that the query descriptor has no chiral atom, so the stereospecific Match Types and those that include stereoisomers return the same results.

b. Find ligands that are similar to, or contain as a substructure, the chemical component EF2.

Chemical component EF2

The InChI search with InChI=1S/C13H10N2O4/c16-10-6-5-9(11(17)14-10)15-12(18)7-3-1-2-4-8(7)13(15)19/h1-4,9H,5-6H2,(H,14,16,17)/t9-/m0/s1 and Match Type

Note that the query descriptor has one chiral atom, so the stereospecific Match Types and those that include stereoisomers return different results.
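Whether a query descriptor carries stereochemical information can be checked roughly before choosing a Match Type. The Python sketch below is a heuristic written for illustration: the helper names are hypothetical, and it only looks for the standard stereo markers ("@", "/", "\" in SMILES; the b, t, m, and s layers in InChI) rather than perceiving chirality from the structure.

```python
def smiles_has_stereo(smiles: str) -> bool:
    """True if the SMILES string contains chirality or bond-geometry markers."""
    # '@' marks tetrahedral centres; '/' and '\' mark double-bond geometry.
    return any(mark in smiles for mark in ("@", "/", "\\"))

def inchi_has_stereo(inchi: str) -> bool:
    """True if the InChI string contains a stereochemical layer."""
    # Skip the version and formula fields, then look at layer prefixes.
    layers = inchi.split("/")[2:]
    return any(layer[:1] in ("b", "t", "m", "s") for layer in layers)

VIB_SMILES = "Cc1c(sc[n+]1Cc2cnc(nc2N)C)CCO"
EF2_INCHI = ("InChI=1S/C13H10N2O4/c16-10-6-5-9(11(17)14-10)15-12(18)"
             "7-3-1-2-4-8(7)13(15)19/h1-4,9H,5-6H2,(H,14,16,17)/t9-/m0/s1")

print(smiles_has_stereo(VIB_SMILES))  # False: no stereo markers in the query
print(inchi_has_stereo(EF2_INCHI))    # True: the /t9-/m0/s1 layers are present
```

This matches the two notes in the examples: the VIB query has nothing for a stereospecific search to distinguish, while the EF2 query does.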



Please report any encountered broken links to info@rcsb.org
Last updated: 3/9/2023