Regression | Discriminant Analysis

Market research is often used to gain an understanding of the underlying drivers of a decision or outcome, and the characteristics that define a customer profile or segment (group). Regression and discriminant analysis are statistical techniques used for predicting which audience a customer falls into, which characteristics have the strongest impact on allocating a customer into a particular segment, and for predicting behaviors and preferences (such as which product an individual would be likely to buy).

John, a marketing manager at a large multinational company, is interested in finding out what would most influence target customers to purchase his company’s products.

Paul, a sales director at a global manufacturing company, has conducted an evaluation of his market and has identified that there are four types of customer in terms of who they are, how they behave and what they need. Now he needs to translate this information into something actionable. How can his salesforce identify which segment each current and prospective customer falls into?

Regression and discriminant analysis will solve John and Paul’s problems. Both techniques predict outcomes and classify people or companies into different segments or categories, but are based on different statistical principles and assumptions.

Regression versus discriminant analysis

Linear regression – examines the relationship between an independent variable (e.g. length of relationship) and a dependent variable at the core of the research objective (e.g. overall satisfaction). Multiple linear regression is used when several independent variables (e.g. length of relationship + product purchased + sales territory) are used to predict the one dependent variable. Linear regression is useful for identifying which actions are likely to improve the outcome measured by the dependent variable (in this case, overall satisfaction).
Logistic regression – estimates the probability of an outcome (the dependent variable) from a set of factors (the independent variables). Unlike linear regression, where the dependent variable is typically measured on a continuum (e.g. a customer satisfaction scale, a price range, etc.), the dependent variable in logistic regression is categorical, i.e. it takes one of a fixed set of outcomes. Logistic regression is based on:
- A dichotomous response, i.e. a binary dependent variable with only two possible outcomes, e.g. Yes/No on usage (this is known as binary logistic regression); or
- Multiple responses (more than two), used to predict, for example, which segment a customer falls into or which product they intend to purchase (this is known as multinomial logistic regression).
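
To make the binary case concrete, the sketch below fits a one-factor logistic regression by plain gradient descent, using only the Python standard library. The satisfaction scores, the Yes/No usage flags, the function names and the learning settings are all invented for illustration; in practice a statistics package would normally do the fitting.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, learning_rate=0.05, epochs=20000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y  # predicted probability minus actual
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1

# Hypothetical data: satisfaction score (1-10) and whether the customer uses the product.
satisfaction = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
uses_product = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(satisfaction, uses_product)

# Estimated probability of usage for a dissatisfied and a satisfied customer.
p_low = sigmoid(b0 + b1 * 2)
p_high = sigmoid(b0 + b1 * 9)
print(f"P(uses | satisfaction=2) = {p_low:.2f}")
print(f"P(uses | satisfaction=9) = {p_high:.2f}")
```

The output of the model is a probability rather than a score on a scale, which is what distinguishes it from linear regression. A multinomial model extends the same idea to more than two outcomes.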

Discriminant analysis – determines the relationship between different independent variables and the dependent variable to predict an outcome. The dependent variable is categorical in nature, such as a segment, as opposed to a continuous variable as with linear regression. Analysis of the independent variables leads to the computation of coefficients (weighting factors) which are used to develop a decision rule at the heart of the model. Examples include an allocation algorithm for determining which segment an individual or company falls into, a tool for determining which product a company would purchase, etc.
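
As a rough sketch of how coefficients turn into a decision rule, the example below implements a two-group linear discriminant with two independent variables, using only the Python standard library. It assumes both groups are equally likely and share the same spread (a pooled covariance). The data, the variable meanings and the function names are hypothetical.

```python
def mean_vector(points):
    """Mean of each of the two variables across a group."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, mean):
    """Sums of squares and cross-products around the group mean."""
    sxx = sum((p[0] - mean[0]) ** 2 for p in points)
    syy = sum((p[1] - mean[1]) ** 2 for p in points)
    sxy = sum((p[0] - mean[0]) * (p[1] - mean[1]) for p in points)
    return sxx, syy, sxy

def fit_discriminant(group0, group1):
    """Return the discriminant coefficients and the cutoff score."""
    m0, m1 = mean_vector(group0), mean_vector(group1)
    s0, s1 = scatter(group0, m0), scatter(group1, m1)
    dof = len(group0) + len(group1) - 2
    sxx, syy, sxy = ((a + b) / dof for a, b in zip(s0, s1))  # pooled covariance
    det = sxx * syy - sxy ** 2
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Coefficients = inverse pooled covariance times the difference in group means.
    weights = ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)
    # Cutoff = discriminant score of the midpoint between the two group means.
    cutoff = (weights[0] * (m0[0] + m1[0]) + weights[1] * (m0[1] + m1[1])) / 2
    return weights, cutoff

def classify(point, weights, cutoff):
    """Allocate a case to group 1 if its score exceeds the cutoff, else group 0."""
    score = weights[0] * point[0] + weights[1] * point[1]
    return 1 if score > cutoff else 0

# Hypothetical cases: (importance of price, importance of service), each rated 1-10.
segment_0 = [(2, 3), (3, 3), (2, 4), (3, 5), (4, 4)]
segment_1 = [(6, 7), (7, 8), (8, 7), (7, 6), (8, 9)]

weights, cutoff = fit_discriminant(segment_0, segment_1)
print(classify((3, 4), weights, cutoff))
print(classify((7, 7), weights, cutoff))
```

The weights play the role of the coefficients described above, and the comparison against the cutoff is the decision rule.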

Note that the segmentation itself is arrived at through other statistical techniques, such as cluster analysis. Logistic regression and discriminant analysis are thus used for predicting segment allocation only once the segments have been identified a priori.

In the case of a segmentation, the allocation algorithm consists of a few simple “killer” questions that can be applied passively (to a customer database) or actively (asked directly of a respondent) to allocate the individual and/or company to a specific segment. Similar tools can be created for purposes beyond segmentation, such as determining which new products customers can be expected to purchase.
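
One way such an allocation tool might look in practice is sketched below: each segment has a scoring function built from weighted answers to the killer questions, and the case is allocated to the highest-scoring segment. The four segment names, the two questions and every weight are hypothetical placeholders, not output from a real model.

```python
# Hypothetical weights per segment: a constant plus one weight per killer question.
SEGMENT_WEIGHTS = {
    "Price-driven": {"constant": -5.0, "price_importance": 1.0, "service_importance": 0.0},
    "Service-driven": {"constant": -5.0, "price_importance": 0.0, "service_importance": 1.0},
    "Demanding": {"constant": -8.0, "price_importance": 0.8, "service_importance": 0.8},
    "Undemanding": {"constant": 0.0, "price_importance": 0.0, "service_importance": 0.0},
}

def segment_scores(answers):
    """Score a customer against every segment from their answers (each rated 1-10)."""
    scores = {}
    for segment, weights in SEGMENT_WEIGHTS.items():
        score = weights["constant"]
        for question, answer in answers.items():
            score += weights[question] * answer
        scores[segment] = score
    return scores

def allocate(answers):
    """Return the segment with the highest score."""
    scores = segment_scores(answers)
    return max(scores, key=scores.get)

# Active use: answers collected directly from a respondent.
print(allocate({"price_importance": 9, "service_importance": 2}))

# Passive use: the same rule applied to every record in a customer database.
database = [
    {"price_importance": 2, "service_importance": 9},
    {"price_importance": 9, "service_importance": 9},
    {"price_importance": 2, "service_importance": 2},
]
print([allocate(record) for record in database])
```

Because the rule only needs a handful of answers, the same function can be embedded in a spreadsheet, a CRM field or a short screening questionnaire for the salesforce.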

Outputs from regression and discriminant analysis

- Which factors differentiate one group from another, e.g. how John can determine which companies are most likely to purchase his products.
- How to classify respondents into different groups, e.g. how Paul can segment his customer base into different buyer and user types through an actionable tool.

The example below shows the impact of various factors on intent to purchase products sold by John’s company. Regression analysis was carried out for this purpose, and the figures shown are regression coefficients (weighting factors). The higher the regression coefficient, the stronger the effect of that factor on intent to purchase. For example, if satisfaction with being an innovative company increased by one point (e.g. from 7 out of 10 to 8 out of 10), then likelihood to purchase would be expected to increase by 0.84 points, all else being equal. John and his company should therefore focus on innovation and complaint resolution, as these will have the biggest impact on purchase intent.
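
The way a regression coefficient is read can be shown with a small least-squares fit on a single factor. The ratings below are invented purely for illustration, and chosen so that the fitted coefficient comes out close to the 0.84 quoted above. They are not the data behind John's analysis, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one independent variable: y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predicted value of the dependent variable for a given rating."""
    return intercept + slope * x

# Hypothetical ratings out of 10: satisfaction with innovation, and intent to purchase.
innovation = [5, 6, 7, 8, 9]
purchase_intent = [4.1, 5.0, 5.7, 6.6, 7.5]

slope, intercept = fit_line(innovation, purchase_intent)

# A one-point improvement in the factor (7 -> 8) moves the prediction by the coefficient.
expected_change = predict(8, slope, intercept) - predict(7, slope, intercept)
print(f"coefficient = {slope:.2f}, expected change = {expected_change:.2f}")
```

With several factors in the model, each coefficient is read the same way, holding the other factors constant.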

The relative impact of various factors on purchase intent
