Sentiment Analysis of Data Science Job Listings on LinkedIn
Southern Methodist University
By Barry Daemi
Updated: May 23, 2023
1. Proof of Concept
With the dawn of applicant tracking systems (ATS) in the hiring processes of so many large U.S. firms, it has become important for an applicant to understand how an ATS functions in the hiring process, and how best to sell themselves both to this complex HR software and to the human element behind it, the human resources team. Long gone are the days of simple resumes and human communication through vetting/screening interviews; automated vetting and text sentiment analysis from modern ATS arrived in the mid-2010s and have only become more popular among Fortune 500 companies, with more limited adoption by mid-size and small U.S. firms due to cost limitations.
Being of self-interest, this project is aimed at discerning the keywords needed to sell oneself to an ATS to land a data science role at a large U.S. firm. The data were collected from data science job listings on LinkedIn; to scrape data from a LinkedIn job listing, e.g. the role description, the Python package Beautiful Soup [1] was used to extract the role description from the scraped HTML of the page. To conduct the text sentiment analysis, the Natural Language Toolkit [2] was used for both the formatting and the analysis itself. In total, the following packages were utilized: nltk (the Natural Language Toolkit), pprint, bs4 (Beautiful Soup), re, requests, numpy, pandas, and matplotlib.pyplot.
import nltk
from pprint import pprint
import requests
from bs4 import BeautifulSoup
import re
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
The following script generates the soup, in other words a parsed version of the scraped HTML page; from this, one can extract the role description. Note that <Response [200]> is printed to the console when requests successfully connects to the webpage and retrieves its HTML contents.
url='https://www.linkedin.com/jobs/view/3422401649/?eBP=CwEAAAGIKwjYj_vQEnYbvBBn2V6wiKWn9ClN6m9vyujQQerrMecjbzIiAGBMoQtO075kD9qxVfPYdICvZOFJH8TgWvs_pyK5rK3IxlOdjyOkgdw6i6Ijaq5kHN9XjUlxkeIkorciw9wzIbpf7MDORrpc_oIbDCGYzj_jqY5BqT-AXYpZOdqEBqUyUzGwPKOgpvf9wZIoWneDJayglEX89z3v2ubRTxuRPdvUonTrm0y7Zg-5HJvy7M-wCjHIBolo8FSjl2TrpUqlE0XhZllwshCwuMrZfT8zhxfFoVQFd_3oucfQ8m3mWRZo_5beZ7eqckh17GvZYyuV5KBgKzjlrNLV6m1HJyUFHbTbixkdna3Nw4fG5TrALIbeQo8XtMyPE4TaIwC6TDr3RUqJ&recommendedFlavor=IN_NETWORK&refId=2oxC6zYa%2Fj%2F0ApGKWLNEbA%3D%3D&trackingId=1cPJJhWY2gdGWL9BkbjRig%3D%3D&trk=flagship3_search_srp_jobs';
response = requests.get(url)
print(response)
soup = BeautifulSoup(response.content, 'html.parser')
<Response [200]>
HTML (HyperText Markup Language) is a markup language, not a programming language: it is tag-based, where tags, delimited by '<' and '>', define the structure and presentation of a page. These HTML tags can be used to narrow the scope of scraping, and their class/id attributes can narrow the scraping further, down to a specific piece of text. To accomplish this, one uses .find() from BeautifulSoup, passing the specific HTML tag and class/id as arguments. There is a significant dilemma with the resulting output, however: it still contains HTML tags. These tags need to be removed before any text sentiment analysis is conducted; if not removed, these frivolous tokens will skew the result of the sentiment analysis, which is an undesired outcome.
The following script extracts the job description with find(), then strips the HTML tags from the string with a regular expression; the result was printed to the console.
d = soup.find('div', class_="description__text description__text--rich")
CLEANR = re.compile('<.*?>|&([a-z0-9]+|#[0-9]{1,6}|#x[0-9a-f]{1,6});')

def cleanhtml(raw_html):
    cleantext = re.sub(CLEANR, '', raw_html)
    return cleantext

text = cleanhtml(str(d))
text
'\n\n\n Job Id: 22598145The Data Science Lead Analyst is a strategic professional who stays abreast of developments within own field and contributes to directional strategy by considering their application in own job and the business. Recognized technical authority for an area within the business. Requires basic commercial awareness. There are typically multiple people within the business that provide the same level of subject matter expertise. Developed communication and diplomacy skills are required in order to guide, influence and convince others, in particular colleagues in other areas and occasional external customers. Significant impact on the area through complex deliverables. Provides advice and counsel related to the technology or operations of the business. Work impacts an entire area, which eventually affects the overall performance and effectiveness of the sub-function/job family.Responsibilities:Conducts strategic data analysis, identifies insights and implications and make strategic recommendations, develops data displays that clearly communicate complex analysis.Mines and analyzes data from various banking platforms to drive optimization and improve data quality.Deliver analytics initiatives to address business problems with the ability to determine data required, assess time effort required and establish a project plan.Consults with business clients to identify system functional specifications. 
Applies comprehensive understanding of how multiple areas collectively integrate to contribute towards achieving business goals.Consults with users and clients to solve complex system issues/problems through in-depth evaluation of business processes, systems and industry standards; recommends solutions.Leads system change process from requirements through implementation; provides user and operational support of application to business users.Formulates and defines systems scope and goals for complex projects through research and fact-finding combined with an understanding of applicable business systems and industry standards.Impacts the business directly by ensuring the quality of work provided by self and others; impacts own team and closely related work teams.Considers the business implications of the application of technology to the current business environment; identifies and communicates risks and impacts.Drives communication between business leaders and IT; exhibits sound and comprehensive communication and diplomacy skills to exchange complex information.Conduct workflow analysis, business process modeling; develop use cases, test plans, and business rules; assist in user acceptance testing.Collaborate on design and implementation of workflow solutions that provide long term scalability, reliability, and performance, and integration with reporting.Develop in-depth knowledge and proficiency of supported business areas and engage business partners in evaluating opportunities for process integration and refinement.Gather requirements and provide solutions across Business SectorsPartner with cross functional teams to analyze, deconstruct, and map current state process and identify improvement opportunities including creation of target operation models.Assist in negotiating for resources owned by other areas in order ensure required work is completed on scheduleDevelop and maintain documentation on an ongoing basis, and train new and existing usersDirect the 
communication of status, issue, and risk disposition to all stakeholders, including Senior ManagementDirect the identification of risks which impact project delivery and ensure mitigation strategies are developed and executed when necessaryEnsure that work flow business case / cost benefit analyses are in line with business objectivesDeliver coherent and concise communications detailing the scope, progress and results of initiatives underwayDevelop strategies to reduce costs, manage risk, and enhance servicesDeploy influencing and matrix management skills in order to ensure technology solutions meet business requirementsPerforms other duties and functions as assigned.Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm\'s reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.Qualifications:MBA or Advanced Degree Information Systems, Business Analysis / Computer Science6-10 years experience using tools for statistical modeling of large data setsProcess Improvement or Project Management experienceEducation:Bachelor’s/University degree or equivalent experience, potentially Masters degreeThis job description provides a high-level review of the types of work performed. 
Other job-related duties may be assigned as required.-------------------------------------------------Job Family Group: Technology-------------------------------------------------Job Family:Data Science------------------------------------------------------Time Type:Full time------------------------------------------------------Primary Location:Irving Texas United States------------------------------------------------------Primary Location Salary Range:$121,560.00 - $182,340.00------------------------------------------------------Citi is an equal opportunity and affirmative action employer.Qualified applicants will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.Citigroup Inc. and its subsidiaries ("Citi”) invite all qualified interested applicants to apply for career opportunities. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity review Accessibility at Citi.View the "EEO is the Law" poster. View the EEO is the Law Supplement.View the EEO Policy Statement.View the Pay Transparency Posting\n\n\n Show more\n\n \n\n\n Show less\n\n \n\n\n'
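As a quick sanity check that the regular-expression cleaner behaves as intended, the definitions above can be exercised on a small hypothetical HTML snippet (a minimal sketch; the snippet is illustrative, not taken from LinkedIn):

```python
import re

# Repeat the cleaner defined above so this snippet runs standalone
CLEANR = re.compile('<.*?>|&([a-z0-9]+|#[0-9]{1,6}|#x[0-9a-f]{1,6});')

def cleanhtml(raw_html):
    return re.sub(CLEANR, '', raw_html)

sample = '<div class="description"><p>Build &amp; deploy models</p></div>'
print(repr(cleanhtml(sample)))  # → 'Build  deploy models'
```

Note that removed tags and entities leave their surrounding whitespace behind, which is why extra spaces (and the leading '\n' runs above) survive into the cleaned text.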
After cleaning the text string, the next step is breaking it up into its component parts, its words and special characters, which are called tokens. To tokenize a text, word_tokenize() from nltk is used; it splits words on whitespace and separates special characters into tokens of their own. To demonstrate the result, the first 15 tokens were printed to the console.
tokens = nltk.word_tokenize(text)
# display the first 15 tokens
tokens[0:15]
['Job', 'Id', ':', '22598145The', 'Data', 'Science', 'Lead', 'Analyst', 'is', 'a', 'strategic', 'professional', 'who', 'stays', 'abreast']
With the text tokenized, the next step is to remove the transition words and special characters, as these are used for flow, context, and tense in written communication but are not the core content itself. Consequently, to discern the true content of a text, one needs to strip these frivolous words and special characters from it, leaving the content words that carry the sentiment and meaning of the text.
Fortunately for us, the Natural Language Toolkit (nltk) provides a predefined English stopword list through nltk.corpus.stopwords.words("english"), which serves as a basis for a stopword dictionary: simply the transition words and special characters that are filler in a text and that one removes to reach the content words. Nonetheless, as the predefined English stopword list proved insufficient to remove all frivolous words and special characters, the list was amended with additional special characters and words; this amended list is what defines our stopword dictionary, stopwords.
With the stopword dictionary complete, a list comprehension was used to remove the defined words from the text: as a result, the text went from 6,212 characters to 538 content tokens. These 538 tokens are the content words of the data science job posting; these are the words that carry meaning in the text.
# Stop words: transition words and special characters
Additional_Stopwords = ['-', '\n', '--', '-Job', 'Id', '.', '``', "''", ')', '(', ',', ':', ';', '’', '/']
stopwords = nltk.corpus.stopwords.words("english") + Additional_Stopwords

# Remove the stop words from the text
ModifiedText = [w for w in tokens if w.lower() not in stopwords]

# Display the original character count alongside the new token count
print("Text: " + str(len(text)))
print("Text - stopwords removed: " + str(len(ModifiedText)))
Text: 6212 Text - stopwords removed: 538
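Note that len(text) measures characters while len(ModifiedText) measures tokens, so the two figures above describe different units and are not directly comparable. A tiny hand-made illustration (the sentence and its token list are hypothetical, hand-tokenized the way word_tokenize would split them, to keep this standalone):

```python
sentence = "Data science, at scale."
# Hand-tokenized: words split on whitespace, punctuation as separate tokens
tokens = ["Data", "science", ",", "at", "scale", "."]

print(len(sentence))  # 23 characters
print(len(tokens))    # 6 tokens
```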
What are the ten most frequent content words? To answer that question, one first creates a frequency distribution of the content words with nltk.FreqDist(), then calls its .most_common() method and slices the first ten entries; the result can then be printed with pprint().
fd = nltk.FreqDist(ModifiedText)
content_words = fd.most_common()
pprint(content_words[0:10])
[('business', 23),
('data', 6),
('complex', 5),
('work', 5),
('communication', 4),
('required', 4),
('areas', 4),
('process', 4),
('strategic', 3),
('within', 3)]
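nltk.FreqDist is a subclass of the standard library's collections.Counter, so the counting step can be sketched without nltk at all; here on a toy token list (the tokens are illustrative):

```python
from collections import Counter

toy_tokens = ['data', 'science', 'data', 'team', 'data', 'science']
fd = Counter(toy_tokens)  # FreqDist(toy_tokens) would behave the same way

print(fd.most_common(2))  # → [('data', 3), ('science', 2)]
```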
Due to limitations with find() from BeautifulSoup on these pages, this metadata had to be hand-entered into Python lists to generate a Pandas dataframe. The following column variables were used in the generation of the dataframe, named dataframe: Date, Company, Industry, Position, Experience, Employment, Salary, Job_description, Content_words, and Website (the listing's URL).
Date=['Two weeks ago','One week ago','One day ago','One week ago','Three weeks ago','Two days ago','One month ago'
,'Three days ago','Three days ago','Three days ago','One month ago','One week ago','Five days ago'
,'Six days ago','One month ago','Two weeks ago','One week ago','One week ago','One month ago'
,'One month ago'];
Company=['Citi','Charles Schwab','Stryker','AEGIS Hedging','Aditi Consulting','Gartner','Concurrency Inc'
,'Quantlab Group','Tata Consultancy Services','Google','Balyasny Asset Management L.P.'
,'Biamp','Amherst','Texas Capital Bank','Abbott','Medpace','Strategic Staffing Solutions'
,'SoFi','incedo','AE Studio'];
Industry=['Financial Services','Financial Services','Medical Equipment Manufacturing','Financial Services'
,'IT Services and IT Consulting','Information','IT Services and IT Consulting','Financial Services'
,'IT Services and IT Consulting','Technology, Information and Internet','Investment Management'
,'Appliances, Electrical, and Electronics Manufacturing','Investment Management','Banking'
,'Hospitals and Healthcare','Pharmaceutical Manufacturing','IT Services and IT Consulting'
,'Financial Services','Information Technology & Services','Software Development'];
Position=['Data Scientist','Data Scientist','Data Scientist - Sales Operations (Remote)','Python/R Quantitative Developer'
,'Machine Learning Engineer','Data Scientist','Data Scientist','Quantitative Developer','Data Scientist'
,'AI Consultant, Google Cloud','Investment Data Analyst - Equities','AI Opportunities Analyst',
'Financial Data Scientist','Data Analyst','Data Scientist','Statistical Analyst - Experienced'
,'Data Scientist (Remote)','Staff Data Scientist - Machine Learning','Machine Learning Engineer'
,'Data Scientist'];
Experience=['Senior','Entry','Associate','Senior','Mid-Senior','Associate','Entry','Associate','Mid-Senior'
,'Senior','Associate','Entry','Mid-Senior','Mid-Senior','Associate','Mid-Senior',
'Entry','Senior','Mid-Senior','Entry'];
Employment=['Full-Time','Full-Time','Full-Time','Full-Time','Contract','Full-Time','Full-Time','Full-Time'
,'Full-Time','Full-Time','Full-Time','Full-Time','Full-Time','Full-Time','Full-Time'
,'Full-Time','Contract','Full-Time','Full-Time','Full-Time'];
Salary=['121,560/yr-182,340/yr','81,800/yr-167,200/yr','83,000/yr-176,800/yr','Not-posted','115,200/yr-134,400/yr'
,'Not-posted','Not-posted','146,000/yr-194,000/yr','Not-posted','215,000/yr-315,000/yr','Not-posted'
,'Not-posted','Not-posted','Not-posted','Not-posted','Not-posted','Not-posted','162,000/yr-247,500/yr'
,'Not-posted','120000/yr-220000/yr'];
Job_description=[];
Job_description.append(text);
Content_words=[];
Content_words.append(content_words);
Response=[' '];
Website=['https://jobs.citi.com/job/-/-/287/42525672736?source=LinkedInJB&dclid=CJKQydmBh_8CFc3HGAIdw90DJA'
,'https://www.linkedin.com/jobs/view/3613370923/?alternateChannel=search&refId=3JbP0Tt1XIXpTWTT0lD8gw%3D%3D&trackingId=%2BpfA1xUVxZ1eWp3i1tX3Lg%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3613108464/?alternateChannel=search&refId=%2FkWkKxTFjB3FfdAProdkbQ%3D%3D&trackingId=nA%2BLlIWXwZiwfKoIPJNE2w%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3599977799/?eBP=CwEAAAGIP8g534pplxr1vYZ_6J-B0u9Xl4s3mN-A8kzMYJyvUcZ5adabeKC0Cuf0_4zpcyXokuzF44-TuDZQ_Vq8fhsCGQ2G4_P9bHPRHO8UulZlQuLnrp_JxcdTyCQ29K_nwTai-s9xgCb3rqwt4XPJ3Cy4dIRw7ZlD0nDatE6yzphRJ4elK9qjJTth5YUXt2YTcxUvITdvrPvAwCfBX7RqV5rCs7T1NuEMejJNP0QrrAUMYs94Eujwrn6JViC3Elrh8Ig4-qybRymK_E7v-YDAabbISh0C_-_PLSfZMeF7dfHvqPNqdNblLm_-GY4d-QUieZtY0IAKP0s1b5au1qWXFPgNj8C-zKF-QS-t9fv84rGbt00yhlFGqoWKdpE2crvavQKX1SY&recommendedFlavor=SCHOOL_RECRUIT&refId=%2FkWkKxTFjB3FfdAProdkbQ%3D%3D&trackingId=%2FPLKxHVvQFxnGJi8r4nPgA%3D%3D&trk=flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3582138574/?alternateChannel=search&refId=3JbP0Tt1XIXpTWTT0lD8gw%3D%3D&trackingId=J7Cx84BH16Eyl%2BQRVSskfg%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3477165235/?eBP=CwEAAAGIP8g0KXzFEofkEzElfx54RNh0nOlnShDjndi4QqQOkqwwN8hbMGVioOW7NHTak_v6IDEWwPDc4oEoPd2r4twihcCYJnGi3g9bByLR2A8ecEjIGinHlUd4WSI-5-1rmOm61lFsuU63LwxjWCXO-IsZCyv-daOqM6DMYLxlcE8kkqoJqgy877PmFH5ojerEelowOFmdWxXKbfngAiLZNaahF4rfXeVuvmg0wszA1hS89nxMiNoepeoqRJkm4-tomwXpTD8tkUSOyeKNhKH0gbOJ1sITrsiOw2-J9r3ZJejufgIV63i44tTn9rSfku6dX3-bqwkDXy_sziqE63jd7Nq7irvezhVyUUJFrSQasT-mUohTDfOFT8Z4VSeCQycvYWep3Bk&recommendedFlavor=COMPANY_RECRUIT&refId=3JbP0Tt1XIXpTWTT0lD8gw%3D%3D&trackingId=TqZ7Qpvo3OX1rKQsbE4%2Fnw%3D%3D&trk=flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3576885404/?alternateChannel=search&refId=%2FkWkKxTFjB3FfdAProdkbQ%3D%3D&trackingId=pYdhv1%2FLXu%2BXatill6KqwA%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3533888707/?eBP=CwEAAAGIQHmkElZsjKXsgv7Uio5Yz3seGaXbtH0VJedWjQX2hiNSv84sJRpLrBUCVZJ0DJO3Aj7p_Ie8gk7dgRjTFhfTZP6W7wAkKatNrqi33SOxMlQxcp5f0XMLFo_J12KreXV8rlQMV6o_6pGkvHYezURpb5zWmewq9PqR68DWQ57rSUcnvtv9TB-xBNO1DRYmKaF10dCeIhojY4BhwB2b6X5KpO1dypD7E-iRJlyPG80XhhpZe_w_XGdbuclMgTclU1f0Fy0vS5vzzFnV31PErrat5Q4k6CqNGSA6m_w7FbkOikrf60Dd4_FWJaW2TtVzeIwUX6NuygD4lxmBn74iwACJLvyhhNBJ633_T6cYk_LA3uWEImM4T99JWYFxHXBAd7N3_jpjew&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=UxEkMI4vaGKfDjn9ytiG0A%3D%3D&trk=flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3606141065/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=H7TRoaQDIoi3GV9cOP1Ngw%3D%3D&trk=d_flagship3_postapply_demographics'
,'https://www.linkedin.com/jobs/view/3583462407/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=1jewIfuP8C7DJJUywH8Khw%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3535282529/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=HbWwvjEbZbKIPFSxxQ1mDw%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3580011415/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=iDN25XLEt7TXdPZ%2Fh7J4VA%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3575142950/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=mX30J8rjqeH00qDJfKLu6A%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3603254581/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=55VlMw5TbhgR7BOJc9wz4Q%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3607337417/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=MskjAWWvcsLFt5yiDfYdfQ%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3595564767/?alternateChannel=search&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=8%2B2pgOHhQrW%2B8KImLsU%2FSA%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3601784881/?alternateChannel=search&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=iWmX3DowrpGGxKspMX8Czg%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3602626868/?alternateChannel=search&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=PaIUJiUbsYox8Qyh8sNowg%3D%3D&trk=d_flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3568734219/?eBP=CwEAAAGIQHmtM7hiePfMP6z0GXdQ9JFGDpC8Arz8Rehh-uUKB0hOyxYFJxtvNWf7UiU-fqt21o6NnB_186LSAWNNHMQxrlMwLCubP-5FeseBtYvZ1BWK3d9b56uururxV8OpZcgBKz0InlxPlVxGEb-xe5B3Ak8z1sYZUhwUpQ-z6z92beMRnxTkFwKh_BQHWhmu6QRmmCIQDZLw0GW52mihT3tQqAXYJWACOVhFl3EQRFCceQ2B3GQR4SlWurvxDe9SEo2ZAUo8CeWXmzyWY5n_z0J8Nmh0Yti-B0GyDt2tMJtt6IEM5zAwHLGfgxgdnNFRye-Lwge7zVzjzAhMBKDtwFKjwvrDf5L8Db15KS-lBFNsfZvfDlDmDjRgAhSrPszGsX6uqw5spg&recommendedFlavor=SCHOOL_RECRUIT&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=Yd%2FKGb8lRSDuZ9VLo%2B5jow%3D%3D&trk=flagship3_search_srp_jobs'
,'https://www.linkedin.com/jobs/view/3586405131/?eBP=JOB_SEARCH_ORGANIC&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=cAIjHogg8mgMWkONgwQP8g%3D%3D&trk=flagship3_search_srp_jobs'];
2. Automating the Concept and Full Data Analysis
The next step is automating the job-description scraping script; the next two code blocks accomplish this objective. The LinkedIn URLs were stored in a Notepad .txt file, which was opened with open() and read line by line. Using a for loop over those lines, each URL was fed through the Job_Data function; note that nothing is returned, since the job descriptions and content words are appended to their respective lists.
def Job_Data(url, Job_description, Content_words, Response):
    response = requests.get(str(url))
    web = BeautifulSoup(response.content, 'html.parser')
    temp = web.find('div', class_="description__text description__text--rich")
    temp_text = cleanhtml(str(temp))
    Job_description.append(temp_text)
    tokens = nltk.word_tokenize(temp_text)
    ModifiedText = [w for w in tokens if w.lower() not in stopwords]
    fd = nltk.FreqDist(ModifiedText)
    content_words = fd.most_common()
    Content_words.append(content_words)
    # list.append returns None, so append in place rather than reassigning Response
    Response.append(' ')
file = open("LinkedIn_urls_Full.txt", "r")
for links in file.readlines():
    Job_Data(links, Job_description, Content_words, Response)
To generate the Pandas dataframe, one only needs to call pd.DataFrame() with an argument of the form {'Name of column': data_list}. For convenience, the resulting dataframe was printed to the console.
dataframe=pd.DataFrame({"Date":Date, "Company":Company, "Industry":Industry, "Position":Position
, "Experience":Experience, "Employment":Employment, "Salary":Salary
, "Job_description":Job_description, "Content_words":Content_words
, "Website":Website});
dataframe
| | Date | Company | Industry | Position | Experience | Employment | Salary | Job_description | Content_words | Website |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Two weeks ago | Citi | Financial Services | Data Scientist | Senior | Full-Time | 121,560/yr-182,340/yr | \n\n\n Job Id: 22598145The Data Science... | [(business, 23), (data, 6), (complex, 5), (wor... | https://jobs.citi.com/job/-/-/287/42525672736?... |
| 1 | One week ago | Charles Schwab | Financial Services | Data Scientist | Entry | Full-Time | 81,800/yr-167,200/yr | \n\n\n Your OpportunityAt Schwab, you’r... | [(data, 14), (business, 7), (experience, 5), (... | https://www.linkedin.com/jobs/view/3613370923/... |
| 2 | One day ago | Stryker | Medical Equipment Manufacturing | Data Scientist - Sales Operations (Remote) | Associate | Full-Time | 83,000/yr-176,800/yr | \n\n\n Why join Stryker?We are proud to... | [(data, 10), (information, 5), (Stryker, 4), (... | https://www.linkedin.com/jobs/view/3613108464/... |
| 3 | One week ago | AEGIS Hedging | Financial Services | Python/R Quantitative Developer | Senior | Full-Time | Not-posted | \n\n\nCompany: AEGIS Hedging SolutionsAEGIS si... | [(data, 13), (financial, 9), (models, 6), (AEG... | https://www.linkedin.com/jobs/view/3599977799/... |
| 4 | Three weeks ago | Aditi Consulting | IT Services and IT Consulting | Machine Learning Engineer | Mid-Senior | Contract | 115,200/yr-134,400/yr | \n\n\nDetails/Scope of the project: We are see... | [(data, 7), (skills, 6), (team, 4), (pipelines... | https://www.linkedin.com/jobs/view/3582138574/... |
| 5 | Two days ago | Gartner | Information | Data Scientist | Associate | Full-Time | Not-posted | \n\n\nAbout the role: This is a unique opportu... | [(data, 10), (science, 7), (e.g., 6), (product... | https://www.linkedin.com/jobs/view/3477165235/... |
| 6 | One month ago | Concurrency Inc | IT Services and IT Consulting | Data Scientist | Entry | Full-Time | Not-posted | \n\n\nWho We AreWe are change agents. We are i... | [(data, 12), (Data, 6), (machine, 5), (learnin... | https://www.linkedin.com/jobs/view/3576885404/... |
| 7 | Three days ago | Quantlab Group | Financial Services | Quantitative Developer | Associate | Full-Time | 146,000/yr-194,000/yr | \n\n\nWe are seeking a Quantitative Developer ... | [(Quantlab, 9), (work, 5), (trading, 4), (writ... | https://www.linkedin.com/jobs/view/3533888707/... |
| 8 | Three days ago | Tata Consultancy Services | IT Services and IT Consulting | Data Scientist | Mid-Senior | Full-Time | Not-posted | \n\n\nAbout TCS :Tata Consultancy Services is ... | [(analysis, 4), (data, 4), (Job, 3), (Neo4j, 3... | https://www.linkedin.com/jobs/view/3606141065/... |
| 9 | Three days ago | Google | Technology, Information and Internet | AI Consultant, Google Cloud | Senior | Full-Time | 215,000/yr-315,000/yr | \n\n\n This role may also be located in... | [(USA, 12), (technical, 9), (Google, 8), (cust... | https://www.linkedin.com/jobs/view/3583462407/... |
| 10 | One month ago | Balyasny Asset Management L.P. | Investment Management | Investment Data Analyst - Equities | Associate | Full-Time | Not-posted | \n\n\nROLE OVERVIEWWithin a global team of dat... | [(data, 15), (Data, 5), (working, 5), (experie... | https://www.linkedin.com/jobs/view/3535282529/... |
| 11 | One week ago | Biamp | Appliances, Electrical, and Electronics Manufa... | AI Opportunities Analyst | Entry | Full-Time | Not-posted | \n\n\nThe role, at a glance: The AI Opportun... | [(AI, 12), (Biamp, 10), (business, 7), (role, ... | https://www.linkedin.com/jobs/view/3580011415/... |
| 12 | Five days ago | Amherst | Investment Management | Financial Data Scientist | Mid-Senior | Full-Time | Not-posted | \n\n\nResponsibilities:Support modeling analyt... | [(business, 4), (modeling, 3), (team, 3), (tea... | https://www.linkedin.com/jobs/view/3575142950/... |
| 13 | Six days ago | Texas Capital Bank | Banking | Data Analyst | Mid-Senior | Full-Time | Not-posted | \n\n\nA Data Analyst collects data about an or... | [(data, 13), (experience, 10), (skills, 9), (m... | https://www.linkedin.com/jobs/view/3603254581/... |
| 14 | One month ago | Abbott | Hospitals and Healthcare | Data Scientist | Associate | Full-Time | Not-posted | \n\n\n Abbott is a global healthcare le... | [(data, 8), (Abbott, 6), (company, 6), (people... | https://www.linkedin.com/jobs/view/3607337417/... |
| 15 | Two weeks ago | Medpace | Pharmaceutical Manufacturing | Statistical Analyst - Experienced | Mid-Senior | Full-Time | Not-posted | \n\n\nResponsibilities Write statistical progr... | [(Medpace, 5), (clinical, 4), (work, 4), (stat... | https://www.linkedin.com/jobs/view/3595564767/... |
| 16 | One week ago | Strategic Staffing Solutions | IT Services and IT Consulting | Data Scientist (Remote) | Entry | Contract | Not-posted | \n\n\n STRATEGIC STAFFING SOLUTIONS (S3... | [(Data, 6), (S3, 5), (Expert, 5), (business, 4... | https://www.linkedin.com/jobs/view/3601784881/... |
| 17 | One week ago | SoFi | Financial Services | Staff Data Scientist - Machine Learning | Senior | Full-Time | 162,000/yr-247,500/yr | \n\n\nEmployee Applicant Privacy NoticeWho we ... | [(learning, 10), (machine, 8), (Risk, 6), (mod... | https://www.linkedin.com/jobs/view/3602626868/... |
| 18 | One month ago | incedo | Information Technology & Services | Machine Learning Engineer | Mid-Senior | Full-Time | Not-posted | \n\n\nJob Title: Machine Learning Engineer Loc... | [(work, 3), (experience, 3), (knowledge, 3), (... | https://www.linkedin.com/jobs/view/3568734219/... |
| 19 | One month ago | AE Studio | Software Development | Data Scientist | Entry | Full-Time | 120000/yr-220000/yr | \n\n\n AE Studio is an LA-based company... | [(equity, 11), (AE, 9), (!, 9), (projects, 8),... | https://www.linkedin.com/jobs/view/3586405131/... |
The following simply prints the number of elements in each column variable; e.g., the column variable Date contains twenty dates.
print("Date: " + str(len(Date)))
print("Company: " + str(len(Company)))
print("Industry: " + str(len(Industry)))
print("Position: " + str(len(Position)))
print("Experience: " + str(len(Experience)))
print("Employment: " + str(len(Employment)))
print("Salary: " + str(len(Salary)))
print("Job_description: " + str(len(Job_description)))
print("Content_words: " + str(len(Content_words)))
print("Response: " + str(len(Response)))
print("Website: " + str(len(Website)))
Date: 20 Company: 20 Industry: 20 Position: 20 Experience: 20 Employment: 20 Salary: 20 Job_description: 20 Content_words: 20 Response: 20 Website: 20
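The eleven print statements above can be condensed into one check; here is a minimal sketch with illustrative stand-in lists (the real column lists each hold twenty entries):

```python
# Stand-in column lists, two entries each, just to demonstrate the pattern
columns = {
    "Date": ['Two weeks ago', 'One week ago'],
    "Company": ['Citi', 'Charles Schwab'],
    "Experience": ['Senior', 'Entry'],
}

lengths = {name: len(col) for name, col in columns.items()}
print(lengths)  # → {'Date': 2, 'Company': 2, 'Experience': 2}

# pandas.DataFrame requires all columns to be the same length
assert len(set(lengths.values())) == 1
```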
For the convenience of the reader, we generate easily clickable links to the job postings.
# Easy access to the LinkedIn urls
file = open("LinkedIn_urls_Full.txt", "r")
for links in file.readlines():
    print(links)
https://www.linkedin.com/jobs/view/3613370923/?alternateChannel=search&refId=3JbP0Tt1XIXpTWTT0lD8gw%3D%3D&trackingId=%2BpfA1xUVxZ1eWp3i1tX3Lg%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3613108464/?alternateChannel=search&refId=%2FkWkKxTFjB3FfdAProdkbQ%3D%3D&trackingId=nA%2BLlIWXwZiwfKoIPJNE2w%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3599977799/?eBP=CwEAAAGIP8g534pplxr1vYZ_6J-B0u9Xl4s3mN-A8kzMYJyvUcZ5adabeKC0Cuf0_4zpcyXokuzF44-TuDZQ_Vq8fhsCGQ2G4_P9bHPRHO8UulZlQuLnrp_JxcdTyCQ29K_nwTai-s9xgCb3rqwt4XPJ3Cy4dIRw7ZlD0nDatE6yzphRJ4elK9qjJTth5YUXt2YTcxUvITdvrPvAwCfBX7RqV5rCs7T1NuEMejJNP0QrrAUMYs94Eujwrn6JViC3Elrh8Ig4-qybRymK_E7v-YDAabbISh0C_-_PLSfZMeF7dfHvqPNqdNblLm_-GY4d-QUieZtY0IAKP0s1b5au1qWXFPgNj8C-zKF-QS-t9fv84rGbt00yhlFGqoWKdpE2crvavQKX1SY&recommendedFlavor=SCHOOL_RECRUIT&refId=%2FkWkKxTFjB3FfdAProdkbQ%3D%3D&trackingId=%2FPLKxHVvQFxnGJi8r4nPgA%3D%3D&trk=flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3582138574/?alternateChannel=search&refId=3JbP0Tt1XIXpTWTT0lD8gw%3D%3D&trackingId=J7Cx84BH16Eyl%2BQRVSskfg%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3477165235/?eBP=CwEAAAGIP8g0KXzFEofkEzElfx54RNh0nOlnShDjndi4QqQOkqwwN8hbMGVioOW7NHTak_v6IDEWwPDc4oEoPd2r4twihcCYJnGi3g9bByLR2A8ecEjIGinHlUd4WSI-5-1rmOm61lFsuU63LwxjWCXO-IsZCyv-daOqM6DMYLxlcE8kkqoJqgy877PmFH5ojerEelowOFmdWxXKbfngAiLZNaahF4rfXeVuvmg0wszA1hS89nxMiNoepeoqRJkm4-tomwXpTD8tkUSOyeKNhKH0gbOJ1sITrsiOw2-J9r3ZJejufgIV63i44tTn9rSfku6dX3-bqwkDXy_sziqE63jd7Nq7irvezhVyUUJFrSQasT-mUohTDfOFT8Z4VSeCQycvYWep3Bk&recommendedFlavor=COMPANY_RECRUIT&refId=3JbP0Tt1XIXpTWTT0lD8gw%3D%3D&trackingId=TqZ7Qpvo3OX1rKQsbE4%2Fnw%3D%3D&trk=flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3576885404/?alternateChannel=search&refId=%2FkWkKxTFjB3FfdAProdkbQ%3D%3D&trackingId=pYdhv1%2FLXu%2BXatill6KqwA%3D%3D&trk=d_flagship3_search_srp_jobs 
https://www.linkedin.com/jobs/view/3533888707/?eBP=CwEAAAGIQHmkElZsjKXsgv7Uio5Yz3seGaXbtH0VJedWjQX2hiNSv84sJRpLrBUCVZJ0DJO3Aj7p_Ie8gk7dgRjTFhfTZP6W7wAkKatNrqi33SOxMlQxcp5f0XMLFo_J12KreXV8rlQMV6o_6pGkvHYezURpb5zWmewq9PqR68DWQ57rSUcnvtv9TB-xBNO1DRYmKaF10dCeIhojY4BhwB2b6X5KpO1dypD7E-iRJlyPG80XhhpZe_w_XGdbuclMgTclU1f0Fy0vS5vzzFnV31PErrat5Q4k6CqNGSA6m_w7FbkOikrf60Dd4_FWJaW2TtVzeIwUX6NuygD4lxmBn74iwACJLvyhhNBJ633_T6cYk_LA3uWEImM4T99JWYFxHXBAd7N3_jpjew&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=UxEkMI4vaGKfDjn9ytiG0A%3D%3D&trk=flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3606141065/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=H7TRoaQDIoi3GV9cOP1Ngw%3D%3D&trk=d_flagship3_postapply_demographics https://www.linkedin.com/jobs/view/3583462407/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=1jewIfuP8C7DJJUywH8Khw%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3535282529/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=HbWwvjEbZbKIPFSxxQ1mDw%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3580011415/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=iDN25XLEt7TXdPZ%2Fh7J4VA%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3575142950/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=mX30J8rjqeH00qDJfKLu6A%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3603254581/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=55VlMw5TbhgR7BOJc9wz4Q%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3607337417/?alternateChannel=search&refId=2lqK2EIwoP%2FvzvGr0JOMFg%3D%3D&trackingId=MskjAWWvcsLFt5yiDfYdfQ%3D%3D&trk=d_flagship3_search_srp_jobs 
https://www.linkedin.com/jobs/view/3595564767/?alternateChannel=search&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=8%2B2pgOHhQrW%2B8KImLsU%2FSA%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3601784881/?alternateChannel=search&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=iWmX3DowrpGGxKspMX8Czg%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3602626868/?alternateChannel=search&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=PaIUJiUbsYox8Qyh8sNowg%3D%3D&trk=d_flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3568734219/?eBP=CwEAAAGIQHmtM7hiePfMP6z0GXdQ9JFGDpC8Arz8Rehh-uUKB0hOyxYFJxtvNWf7UiU-fqt21o6NnB_186LSAWNNHMQxrlMwLCubP-5FeseBtYvZ1BWK3d9b56uururxV8OpZcgBKz0InlxPlVxGEb-xe5B3Ak8z1sYZUhwUpQ-z6z92beMRnxTkFwKh_BQHWhmu6QRmmCIQDZLw0GW52mihT3tQqAXYJWACOVhFl3EQRFCceQ2B3GQR4SlWurvxDe9SEo2ZAUo8CeWXmzyWY5n_z0J8Nmh0Yti-B0GyDt2tMJtt6IEM5zAwHLGfgxgdnNFRye-Lwge7zVzjzAhMBKDtwFKjwvrDf5L8Db15KS-lBFNsfZvfDlDmDjRgAhSrPszGsX6uqw5spg&recommendedFlavor=SCHOOL_RECRUIT&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=Yd%2FKGb8lRSDuZ9VLo%2B5jow%3D%3D&trk=flagship3_search_srp_jobs https://www.linkedin.com/jobs/view/3586405131/?eBP=JOB_SEARCH_ORGANIC&refId=YlVwMFTzTZ%2BK4mzVOGQVMw%3D%3D&trackingId=cAIjHogg8mgMWkONgwQP8g%3D%3D&trk=flagship3_search_srp_jobs
With the content-word data compiled for each of the twenty job listings, it is time to visualize said content words for each listing.
First off is the Data Science role at Citi; we print the top twenty most frequent words in the post.
# Top twenty words for Citi - Data Scientist
print(Content_words[0][0:20]);
[('business', 23), ('data', 6), ('complex', 5), ('work', 5), ('communication', 4), ('required', 4), ('areas', 4), ('process', 4), ('strategic', 3), ('within', 3), ('application', 3), ('area', 3), ('provide', 3), ('skills', 3), ('order', 3), ('technology', 3), ('clients', 3), ('system', 3), ('systems', 3), ('solutions', 3)]
Surprisingly, 'business' was the most common word, with 'data' the second most common. A key observation is that each job listing contains different most-frequent content words, but all posts contain words related to the fields of Data Science and Machine Learning.
The following bar graph illustrates the magnitude of each word frequency in the post; we create such a graph for each of the job listings.
unique_word=[];
word_count=[];
for i in range(len(Content_words[0][0:20])):
    unique_word.append(Content_words[0][i][0]);
    word_count.append(Content_words[0][i][1]);
plt.bar(unique_word,word_count);
plt.title('Citi - Data Scientist - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45, ha='right');
plt.show();
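Since the same plotting steps recur for every listing below, they could be wrapped in a small helper; this is a sketch, and the name `plot_top_words` is introduced here rather than taken from the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

def plot_top_words(content_words, title, top_n=20):
    """Bar-plot the top_n (word, count) pairs for one job listing."""
    pairs = content_words[:top_n]
    unique_word = [word for word, count in pairs]
    word_count = [count for word, count in pairs]
    plt.bar(unique_word, word_count)
    plt.title(title + ' - Job post')
    plt.xlabel("Top Twenty Content Words")
    plt.ylabel("Frequency")
    plt.xticks(rotation=45, ha='right')
    plt.show()
    return unique_word, word_count
```

Each later section would then reduce to a single call, e.g. `plot_top_words(Content_words[1], 'Charles Schwab - Data Scientist')`.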
As a small experiment, let us create a small dictionary of technical skills that might be included in a Data Science job post. We chose Python, as it is the predominant object-oriented programming language in the field; the only object-oriented language that rivals it is C++, a lower-level language with tighter memory management and far more efficient computation. Next we chose SQL (Structured Query Language), as it is the main language used to create, maintain and query a relational database; NoSQL serves the same purpose for non-relational structured data, such as XML. Excel was chosen next, as it is still a predominant tool in industry, though mostly in Data Analysis roles. Lastly 'learn', 'machine' and 'algorithms' were chosen, as machine learning is an important part of Data Science; remember that Cwords is composed entirely of tokenized text, so compound terms have been separated.
Cwords=[];
for i in range(len(Content_words[1][:])):
    Cwords.append(Content_words[1][i][0]);
Contained=[];
Technicals=['python','sql','nosql','excel','learn','machine','algorithms'];
contents=[];
for i in Cwords:
    if i.lower() in Technicals:
        contents.append(True);
        Contained.append(i);
    else:
        contents.append(False);
print("Technical words contained inside post: " + str(sum(contents)));
print(" ")
print("Technical words: ");
print(Contained);
Technical words contained inside post: 4
Technical words: 
['algorithms', 'Python', 'machine', 'SQL']
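The same check generalizes to any listing with a short helper; this is a sketch, and `find_technicals` is a name introduced here, illustrated with a stand-in word list rather than the scraped data.

```python
def find_technicals(content_words, technicals):
    """Return the technical terms that appear among a listing's content words."""
    words = {word.lower() for word, count in content_words}
    wanted = {term.lower() for term in technicals}
    return sorted(words & wanted)

# Illustration with a stand-in (word, count) list:
sample = [('data', 14), ('Python', 2), ('algorithms', 4), ('team', 3)]
find_technicals(sample, ['python', 'sql', 'algorithms'])  # ['algorithms', 'python']
```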
Only four technical words were found: 'algorithms', 'Python', 'machine' and 'SQL'; note that the loop above indexed Content_words[1], i.e. the Charles Schwab listing rather than the Citi one. For the convenience of the reader, we also print the whole content-word list for the Citi - Data Scientist role; it is a lengthy print to the console, so we only do this once, but we thought it useful for the reader to see a full list of content words.
print(Content_words[0]);
[('business', 23), ('data', 6), ('complex', 5), ('work', 5), ('communication', 4), ('required', 4), ('areas', 4), ('process', 4), ('strategic', 3), ('within', 3), ('application', 3), ('area', 3), ('provide', 3), ('skills', 3), ('order', 3), ('technology', 3), ('clients', 3), ('system', 3), ('systems', 3), ('solutions', 3), ('opportunities', 3), ('ensure', 3), ('risk', 3), ('EEO', 3), ('Data', 2), ('Science', 2), ('job', 2), ('multiple', 2), ('diplomacy', 2), ('others', 2), ('particular', 2), ('impact', 2), ('related', 2), ('impacts', 2), ('performance', 2), ('analysis', 2), ('identifies', 2), ('implications', 2), ('initiatives', 2), ('assess', 2), ('time', 2), ('project', 2), ('identify', 2), ('functional', 2), ('comprehensive', 2), ('understanding', 2), ('in-depth', 2), ('industry', 2), ('requirements', 2), ('implementation', 2), ('provides', 2), ('user', 2), ('scope', 2), ('applicable', 2), ('current', 2), ('risks', 2), ('sound', 2), ('workflow', 2), ('modeling', 2), ('use', 2), ('rules', 2), ('integration', 2), ('Business', 2), ('including', 2), ('status', 2), ('strategies', 2), ('duties', 2), ('consideration', 2), ('Policy', 2), ('experience', 2), ('tools', 2), ('review', 2), ('-Job', 2), ('Family', 2), ('Primary', 2), ('Location', 2), ('$', 2), ('Citi', 2), ('opportunity', 2), ('applicants', 2), ('disability', 2), ('apply', 2), ('career', 2), ('Law', 2), ('Show', 2), ('Job', 1), ('Id', 1), ('22598145The', 1), ('Lead', 1), ('Analyst', 1), ('professional', 1), ('stays', 1), ('abreast', 1), ('developments', 1), ('field', 1), ('contributes', 1), ('directional', 1), ('strategy', 1), ('considering', 1), ('Recognized', 1), ('technical', 1), ('authority', 1), ('Requires', 1), ('basic', 1), ('commercial', 1), ('awareness', 1), ('typically', 1), ('people', 1), ('level', 1), ('subject', 1), ('matter', 1), ('expertise', 1), ('Developed', 1), ('guide', 1), ('influence', 1), ('convince', 1), ('colleagues', 1), ('occasional', 1), ('external', 1), ('customers', 1), 
('Significant', 1), ('deliverables', 1), ('Provides', 1), ('advice', 1), ('counsel', 1), ('operations', 1), ('Work', 1), ('entire', 1), ('eventually', 1), ('affects', 1), ('overall', 1), ('effectiveness', 1), ('sub-function/job', 1), ('family.Responsibilities', 1), ('Conducts', 1), ('insights', 1), ('make', 1), ('recommendations', 1), ('develops', 1), ('displays', 1), ('clearly', 1), ('communicate', 1), ('analysis.Mines', 1), ('analyzes', 1), ('various', 1), ('banking', 1), ('platforms', 1), ('drive', 1), ('optimization', 1), ('improve', 1), ('quality.Deliver', 1), ('analytics', 1), ('address', 1), ('problems', 1), ('ability', 1), ('determine', 1), ('effort', 1), ('establish', 1), ('plan.Consults', 1), ('specifications', 1), ('Applies', 1), ('collectively', 1), ('integrate', 1), ('contribute', 1), ('towards', 1), ('achieving', 1), ('goals.Consults', 1), ('users', 1), ('solve', 1), ('issues/problems', 1), ('evaluation', 1), ('processes', 1), ('standards', 1), ('recommends', 1), ('solutions.Leads', 1), ('change', 1), ('operational', 1), ('support', 1), ('users.Formulates', 1), ('defines', 1), ('goals', 1), ('projects', 1), ('research', 1), ('fact-finding', 1), ('combined', 1), ('standards.Impacts', 1), ('directly', 1), ('ensuring', 1), ('quality', 1), ('provided', 1), ('self', 1), ('team', 1), ('closely', 1), ('teams.Considers', 1), ('environment', 1), ('communicates', 1), ('impacts.Drives', 1), ('leaders', 1), ('exhibits', 1), ('exchange', 1), ('information.Conduct', 1), ('develop', 1), ('cases', 1), ('test', 1), ('plans', 1), ('assist', 1), ('acceptance', 1), ('testing.Collaborate', 1), ('design', 1), ('long', 1), ('term', 1), ('scalability', 1), ('reliability', 1), ('reporting.Develop', 1), ('knowledge', 1), ('proficiency', 1), ('supported', 1), ('engage', 1), ('partners', 1), ('evaluating', 1), ('refinement.Gather', 1), ('across', 1), ('SectorsPartner', 1), ('cross', 1), ('teams', 1), ('analyze', 1), ('deconstruct', 1), ('map', 1), ('state', 1), ('improvement', 
1), ('creation', 1), ('target', 1), ('operation', 1), ('models.Assist', 1), ('negotiating', 1), ('resources', 1), ('owned', 1), ('completed', 1), ('scheduleDevelop', 1), ('maintain', 1), ('documentation', 1), ('ongoing', 1), ('basis', 1), ('train', 1), ('new', 1), ('existing', 1), ('usersDirect', 1), ('issue', 1), ('disposition', 1), ('stakeholders', 1), ('Senior', 1), ('ManagementDirect', 1), ('identification', 1), ('delivery', 1), ('mitigation', 1), ('developed', 1), ('executed', 1), ('necessaryEnsure', 1), ('flow', 1), ('case', 1), ('cost', 1), ('benefit', 1), ('analyses', 1), ('line', 1), ('objectivesDeliver', 1), ('coherent', 1), ('concise', 1), ('communications', 1), ('detailing', 1), ('progress', 1), ('results', 1), ('underwayDevelop', 1), ('reduce', 1), ('costs', 1), ('manage', 1), ('enhance', 1), ('servicesDeploy', 1), ('influencing', 1), ('matrix', 1), ('management', 1), ('meet', 1), ('requirementsPerforms', 1), ('functions', 1), ('assigned.Appropriately', 1), ('decisions', 1), ('made', 1), ('demonstrating', 1), ('firm', 1), ("'s", 1), ('reputation', 1), ('safeguarding', 1), ('Citigroup', 1), ('assets', 1), ('driving', 1), ('compliance', 1), ('laws', 1), ('regulations', 1), ('adhering', 1), ('applying', 1), ('ethical', 1), ('judgment', 1), ('regarding', 1), ('personal', 1), ('behavior', 1), ('conduct', 1), ('practices', 1), ('escalating', 1), ('managing', 1), ('reporting', 1), ('control', 1), ('issues', 1), ('transparency.Qualifications', 1), ('MBA', 1), ('Advanced', 1), ('Degree', 1), ('Information', 1), ('Systems', 1), ('Analysis', 1), ('Computer', 1), ('Science6-10', 1), ('years', 1), ('using', 1), ('statistical', 1), ('large', 1), ('setsProcess', 1), ('Improvement', 1), ('Project', 1), ('Management', 1), ('experienceEducation', 1), ('Bachelor', 1), ('s/University', 1), ('degree', 1), ('equivalent', 1), ('potentially', 1), ('Masters', 1), ('degreeThis', 1), ('description', 1), ('high-level', 1), ('types', 1), ('performed', 1), ('job-related', 1), 
('may', 1), ('assigned', 1), ('required.', 1), ('Group', 1), ('Technology', 1), ('Time', 1), ('Type', 1), ('Full', 1), ('Irving', 1), ('Texas', 1), ('United', 1), ('States', 1), ('Salary', 1), ('Range', 1), ('121,560.00', 1), ('182,340.00', 1), ('equal', 1), ('affirmative', 1), ('action', 1), ('employer.Qualified', 1), ('receive', 1), ('without', 1), ('regard', 1), ('race', 1), ('color', 1), ('religion', 1), ('sex', 1), ('sexual', 1), ('orientation', 1), ('gender', 1), ('identity', 1), ('national', 1), ('origin', 1), ('protected', 1), ('veteran.Citigroup', 1), ('Inc.', 1), ('subsidiaries', 1), ('”', 1), ('invite', 1), ('qualified', 1), ('interested', 1), ('person', 1), ('need', 1), ('reasonable', 1), ('accommodation', 1), ('search', 1), ('and/or', 1), ('Accessibility', 1), ('Citi.View', 1), ('poster', 1), ('View', 1), ('Supplement.View', 1), ('Statement.View', 1), ('Pay', 1), ('Transparency', 1), ('Posting', 1), ('less', 1)]
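Rather than scanning the full printout for a particular word, the (word, count) list can be turned into a dictionary for direct lookups; the small list below is a stand-in mirroring the first few Citi entries, not the full Content_words[0].

```python
pairs = [('business', 23), ('data', 6), ('complex', 5)]  # stand-in for Content_words[0]
counts = dict(pairs)
counts.get('business')   # 23
counts.get('python', 0)  # 0 when the word is absent from the listing
```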
Next up is Charles Schwab - Data Scientist.
# Top twenty words for Charles Schwab - Data Scientist
print(Content_words[1][0:20]);
[('data', 14), ('business', 7), ('experience', 5), ('understand', 4), ('algorithms', 4), ('NLP', 4), ('Schwab', 3), ('across', 3), ('apply', 3), ('partners', 3), ('including', 3), ('Dialogflow', 3), ('models', 3), ('years', 3), ('building', 3), ('Knowledge', 3), ('make', 2), ('innovative', 2), ('creative', 2), ('using', 2)]
We plot the top twenty most frequent words yet again.
unique_word=[];
word_count=[];
for i in range(len(Content_words[1][0:20])):
    unique_word.append(Content_words[1][i][0]);
    word_count.append(Content_words[1][i][1]);
plt.bar(unique_word,word_count);
plt.title('Charles Schwab - Data Scientist - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Next up is Stryker - Data Scientist - Sales Operations (Remote).
# Top twenty words for Stryker - Data Scientist - Sales Operations (Remote)
print(Content_words[2][0:20]);
[('data', 10), ('information', 5), ('Stryker', 4), ('based', 4), ('science', 4), ('operations', 4), ('experience', 4), ('one', 3), ('healthcare', 3), ('programs', 3), ('Data', 3), ('improve', 3), ('requirements', 3), ('work', 3), ('opportunities', 3), ('business', 3), ('Azure', 3), ('Power', 3), ('BI', 3), ("'s", 3)]
As the known procedure goes.
unique_word=[];
word_count=[];
for i in range(len(Content_words[2][0:20])):
    unique_word.append(Content_words[2][i][0]);
    word_count.append(Content_words[2][i][1]);
plt.bar(unique_word,word_count);
plt.title('Stryker - Data Scientist - Sales Operations (Remote) - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Following up: AEGIS Hedging - Python/R Quantitative Developer.
# Top twenty words for AEGIS Hedging - Python/R Quantitative Developer
print(Content_words[3][0:20]);
[('data', 13), ('financial', 9), ('models', 6), ('AEGIS', 5), ('modeling', 5), ('analysis', 5), ('solutions', 4), ('tools', 4), ('Python/R', 4), ('Experience', 4), ('Hedging', 3), ('employees', 3), ('experience', 3), ('team', 3), ('pipelines', 3), ('maintain', 3), ('software', 3), ('related', 3), ('including', 3), ('time', 3)]
Following the known procedure.
unique_word=[];
word_count=[];
for i in range(len(Content_words[3][0:20])):
    unique_word.append(Content_words[3][i][0]);
    word_count.append(Content_words[3][i][1]);
plt.bar(unique_word,word_count);
plt.title('AEGIS Hedging - Python/R Quantitative Developer - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Subsequently, Aditi Consulting - Machine Learning Engineer.
# Top twenty words for Aditi Consulting - Machine Learning Engineer
print(Content_words[4][0:20]);
[('data', 7), ('skills', 6), ('team', 4), ('pipelines', 4), ('work', 4), ('large', 3), ('SQL', 3), ('Excellent', 3), ('problem-solving', 3), ('critical', 3), ('thinking', 3), ('Ability', 3), ('well', 3), ('environment', 3), ('effectively', 3), ('communicate', 3), ('technical', 3), ('concepts', 3), ('non-technical', 3), ('stakeholders', 3)]
Proceeding on with the known procedure.
unique_word=[];
word_count=[];
for i in range(len(Content_words[4][0:20])):
    unique_word.append(Content_words[4][i][0]);
    word_count.append(Content_words[4][i][1]);
plt.bar(unique_word,word_count);
plt.title('Aditi Consulting - Machine Learning Engineer - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Next, Gartner - Data Scientist.
# Top twenty words for Gartner - Data Scientist
print(Content_words[5][0:20]);
[('data', 10), ('science', 7), ('e.g.', 6), ('product', 5), ('business', 5), ('Gartner', 5), ('products', 4), ('may', 4), ('learning', 4), ('opportunities', 4), ('development', 4), ('place', 4), ('status', 4), ('join', 3), ('organization', 3), ('work', 3), ('new', 3), ('design', 3), ('client', 3), ('great', 3)]
As ever, the known procedure.
unique_word=[];
word_count=[];
for i in range(len(Content_words[5][0:20])):
    unique_word.append(Content_words[5][i][0]);
    word_count.append(Content_words[5][i][1]);
plt.bar(unique_word,word_count);
plt.title('Gartner - Data Scientist - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Once more into the breach: Concurrency Inc - Data Scientist.
# Top twenty words for Concurrency Inc - Data Scientist
print(Content_words[6][0:20]);
[('data', 12), ('Data', 6), ('machine', 5), ('learning', 5), ('technical', 4), ('Learning', 4), ('development', 4), ('solutions', 3), ('customer', 3), ('needs', 3), ("'ll", 3), ('Machine', 3), ('science', 3), ('models', 3), ('work', 3), ('change', 2), ('inspired', 2), ('technology', 2), ('team', 2), ('status', 2)]
And yet again, a known procedure.
unique_word=[];
word_count=[];
for i in range(len(Content_words[6][0:20])):
    unique_word.append(Content_words[6][i][0]);
    word_count.append(Content_words[6][i][1]);
plt.bar(unique_word,word_count);
plt.title('Concurrency Inc - Data Scientist - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Twice into the breach, Quantlab Group - Quantitative Developer.
# Top twenty words for Quantlab Group - Quantitative Developer
print(Content_words[7][0:20]);
[('Quantlab', 9), ('work', 5), ('trading', 4), ('written', 3), ('including', 3), ('paid', 3), ('resumes', 3), ('search', 3), ('firms', 3), ('team', 2), ('creating', 2), ('tools', 2), ('daily', 2), ('candidate', 2), ('Houston', 2), ('systems', 2), ('technical', 2), ('science', 2), ('math', 2), ('development', 2)]
And yet again, with a known procedure of data visualization.
unique_word=[];
word_count=[];
for i in range(len(Content_words[7][0:20])):
    unique_word.append(Content_words[7][i][0]);
    word_count.append(Content_words[7][i][1]);
plt.bar(unique_word,word_count);
plt.title('Quantlab Group - Quantitative Developer - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Thrice into the breach, Tata Consultancy Services - Data Scientist.
# Top twenty words for Tata Consultancy Services - Data Scientist
print(Content_words[8][0:20]);
[('analysis', 4), ('data', 4), ('Job', 3), ('Neo4j', 3), ('fraud', 3), ('graph', 3), ('TCS', 2), ('Tata', 2), ('ETL', 2), ('procedures', 2), ('SQL', 2), ('SAS', 2), ('R', 2), ('science', 2), ('using', 2), ('results', 2), ('Show', 2), ('Consultancy', 1), ('Services', 1), ('Indian', 1)]
Onwards with the known procedure.
unique_word=[];
word_count=[];
for i in range(len(Content_words[8][0:20])):
    unique_word.append(Content_words[8][i][0]);
    word_count.append(Content_words[8][i][1]);
plt.bar(unique_word,word_count);
plt.title('Tata Consultancy Services - Data Scientist - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Okay, I am out of ideas on what to say now, so proceeding on to Google - AI Consultant, Google Cloud.
# Top twenty words for Google - AI Consultant, Google Cloud
print(Content_words[9][0:20]);
[('USA', 12), ('technical', 9), ('Google', 8), ('customers', 6), ('CA', 5), ('client', 5), ('manage', 5), ('solutions', 5), ('customer', 5), ('role', 4), ('also', 4), ('location', 4), ('Cloud', 4), ('business', 4), ('technology', 4), ('cloud', 4), ('benefits', 4), ('best', 4), ('salary', 4), ('range', 4)]
The known procedure was implemented.
unique_word=[];
word_count=[];
for i in range(len(Content_words[9][0:20])):
    unique_word.append(Content_words[9][i][0]);
    word_count.append(Content_words[9][i][1]);
plt.bar(unique_word,word_count);
plt.title('Google - AI Consultant, Google Cloud - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Next Balyasny Asset Management L.P. - Investment Data Analyst - Equities.
# Top twenty words for Balyasny Asset Management L.P. - Investment Data Analyst - Equities
print(Content_words[10][0:20]);
[('data', 15), ('Data', 5), ('working', 5), ('experience', 5), ('team', 3), ('content', 3), ('across', 2), ('BAM', 2), ('serve', 2), ('quantitative', 2), ('assist', 2), ('new', 2), ('datasets', 2), ('requirements', 2), ('ownership', 2), ('onboarding', 2), ('closely', 2), ('related', 2), ('years', 2), ('and/or', 2)]
A known procedure was implemented for the millionth time.
unique_word=[];
word_count=[];
for i in range(len(Content_words[10][0:20])):
    unique_word.append(Content_words[10][i][0]);
    word_count.append(Content_words[10][i][1]);
plt.bar(unique_word,word_count);
plt.title('Balyasny Asset Management L.P. - Investment Data Analyst - Equities - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Another AI job: Biamp - AI Opportunities Analyst.
# Top twenty words for Biamp - AI Opportunities Analyst
print(Content_words[11][0:20]);
[('AI', 12), ('Biamp', 10), ('business', 7), ('role', 5), ('work', 4), ('solutions', 4), ('people', 4), ('audiovisual', 3), ('stakeholders', 3), ('tools', 3), ('related', 3), ('team', 3), ('great', 3), ('believe', 3), ('Opportunities', 2), ('Analyst', 2), ('ways', 2), ('leverage', 2), ('technology', 2), ('improve', 2)]
And yet again, another implementation of the known procedure.
unique_word=[];
word_count=[];
for i in range(len(Content_words[11][0:20])):
    unique_word.append(Content_words[11][i][0]);
    word_count.append(Content_words[11][i][1]);
plt.bar(unique_word,word_count);
plt.title('Biamp - AI Opportunities Analyst - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Another financial job that pertains to Data Science: Amherst - Financial Data Scientist (note: there were two previous ones, Citi and AEGIS Hedging).
# Top twenty words for Amherst - Financial Data Scientist
print(Content_words[12][0:20]);
[('business', 4), ('modeling', 3), ('team', 3), ('teams', 3), ('analytics', 2), ('research', 2), ('project', 2), ('strong', 2), ('learn', 2), ('new', 2), ('quantitative', 2), ('skills', 2), ('experience', 2), ('real', 2), ('estate', 2), ('Show', 2), ('Responsibilities', 1), ('Support', 1), ('production', 1), ('supportMaintain', 1)]
Despite a unique job listing, the known procedure is neither unique nor rare.
unique_word=[];
word_count=[];
for i in range(len(Content_words[12][0:20])):
    unique_word.append(Content_words[12][i][0]);
    word_count.append(Content_words[12][i][1]);
plt.bar(unique_word,word_count);
plt.title('Amherst - Financial Data Scientist - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
A fourth job in finance related to Data Science: Texas Capital Bank - Data Analyst. Note that this post was remarkably wordy, as it contained every single buzzword in the book, and it also revealed the bank's need for a whole data science team, not just a single data analyst. Why?
# Top twenty words for Texas Capital Bank - Data Analyst
print(Content_words[13][0:20]);
[('data', 13), ('experience', 10), ('skills', 9), ('management', 8), ('analysis', 5), ('information', 4), ('requirements', 4), ('team', 4), ('Data', 3), ('business', 3), ('using', 3), ('analyze', 3), ('statistical', 3), ('including', 3), ('knowledge', 3), ('work', 3), ('time', 3), ('systems', 2), ('include', 2), ('insight', 2)]
As much confusion as this job listing stirred in me, my unease was resolved by the fact that the known procedure yet again had to be implemented in the following code block. All was made right...
unique_word=[];
word_count=[];
for i in range(len(Content_words[13][0:20])):
    unique_word.append(Content_words[13][i][0]);
    word_count.append(Content_words[13][i][1]);
plt.bar(unique_word,word_count);
plt.title('Texas Capital Bank - Data Analyst - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Next is Abbott - Data Scientist; Abbott is a global medical devices and healthcare company.
# Top twenty words for Abbott - Data Scientist
print(Content_words[14][0:20]);
[('data', 8), ('Abbott', 6), ('company', 6), ('people', 5), ('work', 5), ('analysis', 5), ('experience', 5), ('medical', 4), ('career', 4), ('healthcare', 3), ('health', 3), ('retirement', 3), ('business', 3), ('etc', 3), ('eg', 3), ('programs', 3), ('global', 2), ('leader', 2), ('live', 2), ('fully', 2)]
Another unique posting, but the same well-known procedure.
unique_word=[];
word_count=[];
for i in range(len(Content_words[14][0:20])):
    unique_word.append(Content_words[14][i][0]);
    word_count.append(Content_words[14][i][1]);
plt.bar(unique_word,word_count);
plt.title('Abbott - Data Scientist - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
It looks like an academic research job, but it actually pays well: Medpace - Statistical Analyst - Experienced.
# Top twenty words for Medpace - Statistical Analyst - Experienced
print(Content_words[15][0:20]);
[('Medpace', 5), ('clinical', 4), ('work', 4), ('statistical', 3), ('analysis', 3), ('study', 3), ('development', 3), ('local', 3), ('expertise', 3), ('across', 3), ('30', 3), ('programs', 2), ('methods', 2), ('review', 2), ('data', 2), ('key', 2), ('Knowledge', 2), ('pharmaceutical', 2), ('CRO', 2), ('medical', 2)]
A known procedure was executed.
unique_word=[];
word_count=[];
for i in range(len(Content_words[15][0:20])):
    unique_word.append(Content_words[15][i][0]);
    word_count.append(Content_words[15][i][1]);
plt.bar(unique_word,word_count);
plt.title('Medpace - Statistical Analyst - Experienced - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
I don't know anything about this company other than that it is an IT firm: Strategic Staffing Solutions - Data Scientist (Remote). And a huge plus: it is remote.
# Top twenty words for Strategic Staffing Solutions - Data Scientist (Remote)
print(Content_words[16][0:20]);
[('Data', 6), ('S3', 5), ('Expert', 5), ('business', 4), ('Insurance', 4), ('Scientist', 3), ('Science', 3), ('SQL', 3), ('MS', 3), ('–', 3), ('Intermediate', 3), ('!', 2), ('Corp', 2), ('Remote', 2), ('#', 2), ('Develops', 2), ('complex', 2), ('models', 2), ('and/or', 2), ('customers', 2)]
The known procedure was successfully executed. Pay incremented by a single dollar.
unique_word=[];
word_count=[];
for i in range(len(Content_words[16][0:20])):
    unique_word.append(Content_words[16][i][0]);
    word_count.append(Content_words[16][i][1]);
plt.bar(unique_word,word_count);
plt.title('Strategic Staffing Solutions - Data Scientist (Remote) - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Another financial service firm needing a Data scientist: SoFi - Staff Data Scientist - Machine Learning.
# Top twenty words for SoFi - Staff Data Scientist - Machine Learning
print(Content_words[17][0:20]);
[('learning', 10), ('machine', 8), ('Risk', 6), ('models', 6), ('work', 5), ('support', 5), ('business', 5), ('SoFi', 4), ('Engineering', 4), ('etc', 4), ('complex', 4), ('skills', 4), ('financial', 3), ('Data', 3), ('various', 3), ('credit', 3), ('closely', 3), ('Product', 3), ('teams', 3), ('solve', 3)]
The known procedure was successfully executed by the Python interpreter; no errors appeared in the console. Credit score incremented by 1.
unique_word=[];
word_count=[];
for i in range(len(Content_words[17][0:20])):
    unique_word.append(Content_words[17][i][0]);
    word_count.append(Content_words[17][i][1]);
plt.bar(unique_word,word_count);
plt.title('SoFi - Staff Data Scientist - Machine Learning - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Another machine learning developer/research role: incedo - Machine Learning Engineer.
# Top twenty words for incedo - Machine Learning Engineer
print(Content_words[18][0:20]);
[('work', 3), ('experience', 3), ('knowledge', 3), ('thinking', 3), ('across', 2), ('Services', 2), ('developing', 2), ('client', 2), ('skills', 2), ('data', 2), ('systems', 2), ('business', 2), ('distributed', 2), ('libraries', 2), ('machine', 2), ('learning', 2), ('Python', 2), ('creative', 2), ('problem', 2), ('solving', 2)]
The known procedure, implemented again.
unique_word=[];
word_count=[];
for i in range(len(Content_words[18][0:20])):
    unique_word.append(Content_words[18][i][0]);
    word_count.append(Content_words[18][i][1]);
plt.bar(unique_word,word_count);
plt.title('incedo - Machine Learning Engineer - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
Not sure why a custom software development firm needs a data scientist, but here is the job listing: AE Studio - Data Scientist.
# Top twenty words for AE Studio - Data Scientist
print(Content_words[19][0:20]);
[('equity', 11), ('AE', 9), ('!', 9), ('projects', 8), ('$', 8), ('?', 7), ('work', 6), ('agency', 5), ('...', 5), ('Skunkworks', 5), ('human', 4), ('may', 4), ('data', 4), ('one', 4), ('client', 4), ('value', 4), ('receive', 4), ('free', 4), ('“', 3), ('”', 3)]
The known procedure was successfully implemented for the last time, before being cleared from the PC's RAM (random access memory).
unique_word=[];
word_count=[];
for i in range(len(Content_words[19][0:20])):
    unique_word.append(Content_words[19][i][0]);
    word_count.append(Content_words[19][i][1]);
plt.bar(unique_word,word_count);
plt.title('AE Studio - Data Scientist - Job post');
plt.xlabel("Top Twenty Content Words");
plt.ylabel("Frequency");
plt.xticks(rotation=45,ha='right');
plt.show();
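A natural follow-up across all twenty listings would be to merge the per-listing counts into one overall ranking with collections.Counter; this is a sketch in which the two small lists below stand in for entries of Content_words, not the scraped data.

```python
from collections import Counter

listings = [
    [('data', 14), ('business', 7)],            # stand-in for one listing's counts
    [('data', 10), ('business', 3), ('sql', 3)],  # stand-in for another listing
]
total = Counter()
for pairs in listings:
    total.update(dict(pairs))  # add this listing's counts into the running total
print(total.most_common(3))  # [('data', 24), ('business', 10), ('sql', 3)]
```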
We might revisit this Jupyter notebook in the near future to add further sentiment analysis, but for now it has come to an end. We hope this Jupyter notebook was useful in some way.
Citations:
- [1]: Richardson, Leonard. Beautiful Soup Python Package. Crummy.com, 2023. URL: https://www.crummy.com/software/BeautifulSoup/bs4/doc/
- [2]: Bird, Steven; Loper, Edward; Klein, Ewan. Natural Language Processing with Python. NLTK Project, 2009. URL: https://www.nltk.org/