Medicinal Chemistry: Lead Discovery and Identification

Published On: September 10, 2019 | Last Updated: November 23, 2022

Medicinal chemistry lays the foundation for the drug discovery process, with the first steps being lead discovery and identification. A strong lead compound increases the chances of a pharmaceutical company developing a strong drug candidate, with better chances of success later on. However, choosing a compound from a library of thousands can be tricky. We look at ways to identify initial compounds, how to compare their activity, and which properties mark out promising drug candidates.

Any drug discovery program starts with lead discovery and identification, where huge libraries of compounds are screened for their activity. Successful compounds are known as hits, which are filtered further until the lead compound is identified.

Lead Discovery and Identification

What is a Lead?

At the core of every drug is its active ingredient, the magical molecule responsible for bringing about therapeutic effects. But the process of bringing such a molecule from its discovery to market is long, and littered with countless opportunities to fail completely.

All medicinal chemistry programs begin by selecting a physiological target, followed by screening compounds for activity toward it. Promising compounds are known as leads. Leads are crucial in drug development because they afford the flexibility to chop and change the structure without too much consequence.

After the lead compound is modified and optimized, it becomes the unpolished diamond that is the drug candidate, ready for clinical trials.

Target Identification

Target identification is the initial step of the drug discovery process, in which a specific biological target is selected for which a drug can be designed. Since the body is made up of cells, drug targets are often components of cellular machinery. These include receptors that transmit biological signals, transporters and channels that move substances in and out of cells, or enzymes that catalyze specific chemical reactions inside the cell.

Targets are chosen based on the disease that they aim to treat, with the main factors being the demand for the drug and the likelihood of producing a successful drug. Because of this, novel drugs or those for rare and tropical diseases are rarely pursued, due to their risky returns on investment.

A drug target must also be directly involved in a disease process, such that changing its activity is likely to produce a therapeutic effect.

Where to Look for Leads?

After a target is identified, the search begins for compounds that show promising activity toward it. These ‘hits’ form the pool from which lead compounds are chosen. Lead compounds can arise from natural sources, whose complex therapeutic compounds are the result of millions of years of evolution. Early drug discovery was often serendipitous, the result of pure luck.

Penicillin, for example, was discovered in 1928 by Scottish scientist Alexander Fleming after his petri dish of Staphylococcus was found to be contaminated by mold. It turned out that the mold was capable of producing an anti-bacterial compound that became the active ingredient of Penicillin.

The chemical synthesis of therapeutic compounds is a relatively recent endeavor, yet one that has yielded immense benefits to humankind. The ability to determine structure-activity relationships is key to rational drug design, in which the drug is tailor-made to fit its purpose, in terms of safety and efficacy.

Previously discovered or synthesized compounds are stored in ‘libraries’, actual databases with known structural and chemical information attached. As computer programs cannot accurately predict the effects of a compound on a specific target, the creation of libraries often involves a great deal of laboratory work. This includes synthesis, extraction and purification of compounds, followed by tests to confirm their properties.

How to Identify a Lead Compound

High Throughput Screening (HTS)

After a target is identified, high throughput screening (HTS) is used to identify compounds that show activity toward the target. It is ‘high throughput’ because a large library of compounds is screened, producing a small number of ‘hits’ from which lead compounds are selected. For example, if the biological target is a certain receptor on cells, HTS can test a library of compounds using established quantitative methods to screen their affinities (binding) to that receptor.

A high throughput screen machine in action
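
As a rough illustration of how a screen narrows a library down to hits, the sketch below keeps only the compounds whose measured response meets a cutoff. The compound IDs, the readout values and the 50 percent threshold are all hypothetical; real campaigns choose their own cutoffs and statistics.

```python
def select_hits(readout, threshold=50.0):
    """Return compound IDs whose response meets the cutoff, strongest first.

    readout maps compound ID -> percent response from a single-concentration
    screen. The default threshold of 50.0 is an assumed value for illustration.
    """
    passing = (cid for cid, value in readout.items() if value >= threshold)
    return sorted(passing, key=lambda cid: readout[cid], reverse=True)


# Hypothetical screening readout for a four-compound "library"
library = {"CPD-001": 12.0, "CPD-002": 87.5, "CPD-003": 49.9, "CPD-004": 63.0}
hits = select_hits(library)
print(hits)
```

Only the compounds that pass this first filter would move on to the more detailed measurements described next.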

An accurate and effective way to measure signals (or response) is therefore key in the discovery of lead compounds. One common method is by using competition binding assays, in which a radioactive ligand is first introduced, binding to the target receptor.

The screened compound is then added, which displaces the bound radioactive ligands from the target. The change in signal as the radioactive ligands are freed is measured. As the compound concentration increases, all the bound radioactive ligands on the target receptor will eventually be displaced, i.e. the maximum response is reached.

Quantifying Activity with IC50

When screening thousands of compounds, it is important to be able to quickly quantify and compare the activity between them. Generally, an increase in the concentration of a compound leads to an increase in the target response.

Once the maximum response is reached (won’t go any higher), the concentration that induces half the maximum effect is measured – this is its half-maximal inhibitory concentration (IC50) value. A smaller IC50 means a more potent compound, requiring a lower concentration to produce the same effect.

The IC50 can be easily visualized by using concentration-response curves. Here Compound X is deemed more potent than Compound Y, requiring a lower concentration to elicit the same response.
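
To make the idea concrete, here is a minimal sketch that reads an IC50 off a concentration-response curve by interpolating, on a log-concentration axis, where the response crosses half of the maximum observed. The sigmoidal `response` function, the concentration grid and the two compounds' "true" values are assumptions used only to generate example data; in practice the curve is usually fitted rather than interpolated.

```python
import math


def response(conc, ic50, hill=1.0, top=100.0):
    """Hypothetical sigmoidal concentration-response curve (percent of maximum)."""
    return top / (1.0 + (ic50 / conc) ** hill)


def estimate_ic50(concs, responses):
    """Interpolate the concentration giving half of the maximum observed response.

    Assumes concs is sorted in ascending order and the response rises with it.
    """
    half = max(responses) / 2.0
    points = list(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= half <= r_hi and r_hi > r_lo:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("response never crosses half of its maximum")


# Concentrations from 1 nM to 1 mM, four points per decade (molar units)
concs = [1e-9 * 10 ** (i / 4) for i in range(25)]
ic50_x = estimate_ic50(concs, [response(c, 1e-7) for c in concs])  # Compound X
ic50_y = estimate_ic50(concs, [response(c, 1e-5) for c in concs])  # Compound Y
print(f"Compound X: {ic50_x:.2e} M, Compound Y: {ic50_y:.2e} M")
```

The smaller value for Compound X reflects the comparison in the figure: it reaches the same response at a lower concentration.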

Quantifying Efficacy with EC50 and Percent Efficacy

Often, different compounds induce different maximum responses, which makes their IC50 values hard to compare directly. To account for this, we can set the maximal response of a target bound to its ‘natural’ ligand (under physiological conditions) to 100 percent efficacy.

The compounds screened can then be assigned percent efficacy values based on their maximal response. The half-maximal effective concentration (EC50) is then the concentration of a compound that induces half of its individual maximal response. Both the EC50 as well as the percent efficacy have to be taken into account.

By setting the maximum response of the natural ligand at 100%, we can say that Compound X has a lower EC50 but ~80% maximum percent efficacy.
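
The sketch below shows one way to keep both numbers side by side when ranking screened compounds. The `ScreenResult` type, the raw signal values and the EC50 figures are hypothetical and chosen only so that Compound X comes out at 80 percent efficacy, as in the figure.

```python
from dataclasses import dataclass


@dataclass
class ScreenResult:
    name: str
    ec50_molar: float    # concentration giving half of the compound's own maximum
    max_response: float  # plateau signal, in the assay's raw units


def percent_efficacy(result, natural_ligand_max):
    """Maximal response relative to the natural ligand, which is set to 100%."""
    return 100.0 * result.max_response / natural_ligand_max


natural_max = 250.0  # hypothetical plateau signal for the natural ligand
results = [
    ScreenResult("Compound X", 2e-8, 200.0),
    ScreenResult("Compound Y", 5e-7, 250.0),
]

# Rank by EC50 (most potent first) but always report efficacy alongside it
for r in sorted(results, key=lambda r: r.ec50_molar):
    print(f"{r.name}: EC50 = {r.ec50_molar:.1e} M, "
          f"efficacy = {percent_efficacy(r, natural_max):.0f}%")
```

Here Compound X is the more potent of the two, yet it never reaches the full response of the natural ligand, which is why neither value is enough on its own.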

Ligand Efficiency

Another method to compare activity between compounds is by using its IC50 value in a ligand efficiency (LE) equation:

LE = 1.4 × (−log IC50) / N

The ligand efficiency is the potency value divided by N, the number of nonhydrogen atoms in the ligand. If two different ligands have a similar IC50 value, the smaller one with fewer atoms will have a higher ligand efficiency.
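
A short worked example may help here. It uses the common convention LE = 1.4 × (−log10 IC50) / N, with the IC50 in molar units; the 1.4 scale factor (which roughly converts the result to kcal/mol per non-hydrogen atom) and the two example ligands are assumptions for illustration.

```python
import math


def ligand_efficiency(ic50_molar, heavy_atoms, scale=1.4):
    """Potency (-log10 IC50) divided by the number of non-hydrogen atoms.

    scale=1.4 is an assumed convention; pass scale=1.0 for the plain ratio.
    """
    if ic50_molar <= 0 or heavy_atoms <= 0:
        raise ValueError("IC50 and atom count must be positive")
    return scale * -math.log10(ic50_molar) / heavy_atoms


# Two hypothetical ligands with the same IC50 (100 nM) but different sizes
small = ligand_efficiency(1e-7, heavy_atoms=20)
large = ligand_efficiency(1e-7, heavy_atoms=35)
print(f"small ligand: {small:.2f}, large ligand: {large:.2f}")
```

With identical potency, the 20-atom ligand scores higher than the 35-atom one, matching the statement above.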

In general, small molecules (with higher ligand efficiency) are preferred for lead compounds. From a green chemistry perspective, a smaller molecule also means better atom economy and less generated waste.

Further ‘Drug-Like’ Studies

The data from high-throughput screening provides us with a pool of compounds that show activity to a specific target. However, having high activity doesn’t necessarily mean that a compound will be a good drug candidate. We must ensure that the lead compounds we choose can perform under physiological conditions. There are certain characteristic ‘drug-like’ features that give certain compounds an edge over others, which we will discuss here.

Ease of Chemical Transformations

Take a look at the pool of compounds that make up the current drug landscape and it is clear that several recurring features appear. Although synthetic organic chemistry is able to create an almost limitless array of different molecules, the medicinal chemistry pool is tiny!

This is because of the limited synthetic techniques that are available to medicinal chemists, owing to the safety and yield concerns of many reactions. In a study of over 2000 drugs, it was found that their structures could be described by just 32 chemical frameworks [1].

Palladium-catalyzed cross-coupling reactions are important in biomedical research.

Furthermore, certain functional groups or structures tend to be avoided in medicinal chemistry, due to their inherent safety concerns in vivo. They are known collectively as toxicophoric groups and are usually reactive electrophiles (such as Michael acceptors and epoxides), capable of alkylating biological nucleophiles such as the amino acid cysteine.

Others, like furan and aniline rings, are also avoided, as they can be metabolically activated by CYP450 enzymes into electrophilic derivatives.

Solubility and Administration

There are a variety of ways to administer a drug so that it reaches its intended location in the body. After all, the compound can’t exert its therapeutic effect unless it finds its target. Routes of administration include inhalation, injection and topical application, but the most common route is oral administration.

Drugs in the form of liquids or tablets are the most convenient, as they can be taken by the patient without medical supervision.

For all routes of administration, the solubility of a compound in the body is important. Most of the time, we require the drug to dissolve in the bloodstream, which is a polar aqueous environment.

However, we also need the compound to cross the non-polar lipid cell membranes to interact with the cell machinery. Therefore a balance between polar and non-polar solubility is important. The lipophilicity (log P) value of a compound is a measure of this relative solubility, determined by mixing the compound in equal parts 1-octanol and water and comparing how much of it ends up in each layer.

log P = log ([compound] in 1-octanol / [compound] in water)

A higher log P value, therefore, indicates lipophilic (non-polar) character. A low log P, on the other hand, indicates hydrophilic (polar) character. As a general rule of thumb, a log P value between 1 and 3 is desired for good bioavailability. This can be modified later by inserting polar or non-polar functional groups to fine-tune solubility.
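
As a quick numerical check of this rule of thumb, the sketch below computes log P as the base-10 logarithm of the ratio of a compound's concentration in the 1-octanol layer to its concentration in the water layer. The concentrations are made-up values, and the helper names are hypothetical.

```python
import math


def log_p(conc_octanol, conc_water):
    """log10 of the octanol/water concentration ratio (same units for both)."""
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


def in_desired_range(value, low=1.0, high=3.0):
    """Rule-of-thumb window for good bioavailability quoted in the text."""
    return low <= value <= high


# Hypothetical measurements in mM after the two layers have separated
balanced = log_p(conc_octanol=9.0, conc_water=0.09)  # mostly in octanol
polar = log_p(conc_octanol=0.5, conc_water=5.0)      # mostly in water
print(f"{balanced:.1f} -> {in_desired_range(balanced)}")
print(f"{polar:.1f} -> {in_desired_range(polar)}")
```

The first compound lands inside the 1 to 3 window, while the second is too hydrophilic and would be a candidate for adding non-polar groups.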

Lipinski’s Rule (Rule of 5)

Lipinski’s rule of 5 provides us with a rule of thumb to predict whether a drug will be orally active. An orally active drug should not violate more than one of the following criteria:

  1. Have a molecular weight below 500
  2. Have a lipophilicity value (log P) of less than 5
  3. Have no more than 5 hydrogen bond donors
  4. Have no more than 10 hydrogen bond acceptors

The rules above are empirical, formulated based on successful drugs. 94% of oral drugs fall within Lipinski’s space [2]. However, lead compounds generally follow a stricter ruleset (MW < 300, log P < 3, H bond donors/acceptors < 3), leaving some room for further modification and optimization.
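
The criteria above translate directly into a small filter. This is a minimal sketch: the function names are hypothetical, the descriptor values are invented rather than taken from real compounds, and in practice the descriptors would come from a cheminformatics toolkit.

```python
def lipinski_violations(mol_weight, log_p, h_donors, h_acceptors):
    """List which of the four rule-of-5 criteria a compound breaks."""
    violations = []
    if mol_weight >= 500:
        violations.append("molecular weight")
    if log_p >= 5:
        violations.append("log P")
    if h_donors > 5:
        violations.append("H-bond donors")
    if h_acceptors > 10:
        violations.append("H-bond acceptors")
    return violations


def likely_orally_active(mol_weight, log_p, h_donors, h_acceptors):
    """Rule of 5: no more than one violation is allowed."""
    return len(lipinski_violations(mol_weight, log_p, h_donors, h_acceptors)) <= 1


def is_lead_like(mol_weight, log_p, h_donors, h_acceptors):
    """Stricter lead ruleset quoted in the text (MW < 300, log P < 3, donors/acceptors < 3)."""
    return mol_weight < 300 and log_p < 3 and h_donors < 3 and h_acceptors < 3


# Hypothetical descriptor sets: (MW, log P, donors, acceptors)
drug_sized = (350.0, 2.1, 2, 5)
oversized = (720.0, 6.3, 4, 9)
fragment = (240.0, 1.8, 1, 2)

print(lipinski_violations(*oversized))
print(likely_orally_active(*drug_sized), is_lead_like(*drug_sized))
print(likely_orally_active(*fragment), is_lead_like(*fragment))
```

Note that a compound can satisfy the rule of 5 without being lead-like: the drug-sized example passes the first check but is already too large and acceptor-rich to leave much room for optimization.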

Medicinal Chemistry: Modification and Optimization

Just identifying ‘hits’ isn’t all there is to drug discovery though! Clinical trials are extremely expensive, therefore we must first attempt to create the best version of our compound for use as the drug candidate. The next step in medicinal chemistry is therefore to modify and optimize the lead compound at the molecular level. Synthetic organic chemistry techniques enable us to improve the compound’s pharmacokinetic and pharmacodynamic properties, resulting in a better drug candidate with a greater chance of success in the next stage of drug development.


  1. Boström, J., Brown, D. G., Young, R. J., & Keserü, G. M. (2018). Expanding the medicinal chemistry synthetic toolbox. Nature Reviews Drug Discovery, 17(10), 709-727.
  2. DeGoey, D. A., Chen, H. J., Cox, P. B., & Wendt, M. D. (2017). Beyond the Rule of 5: Lessons Learned from AbbVie’s Drugs and Compound Collection: Miniperspective. Journal of Medicinal Chemistry, 61(7), 2636-2651.

About the Author

Sean Lim

Sean is a consultant for clients in the pharmaceutical industry and is a lecturer at a local university, where unfortunate undergrads are subject to his ramblings on chemistry and pharmacology.
