
Analysis Of Court Transcripts Reveals Biased Jury Selection

Cornell researchers have shown that data science and artificial intelligence tools can successfully identify when prosecutors question potential jurors differently in an effort to keep women and Black people from serving on juries.

In a first-of-its-kind study, researchers used natural language processing (NLP) tools to analyze transcripts of the jury selection process. They found multiple quantifiable differences in how prosecutors questioned Black and white members of the jury pool. Once validated, this technology could provide evidence for appeals cases and be used in real time during jury selection to ensure more diverse juries.

The study, "Quantifying Disparate Questioning of Black and White Jurors in Capital Jury Selection," was published July 14 in the Journal of Empirical Legal Studies. First author is Anna Effenberger.

Striking jurors on the basis of race or gender has been illegal since the Supreme Court's landmark Batson v. Kentucky case in 1986, but this type of discrimination still occurs, said study co-author John Blume, the Samuel F. Leibowitz Professor of Trial Techniques at Cornell Law School and director of the Cornell Death Penalty Project.

"One of the things the courts have looked at is whether the prosecutor questions Black and white jurors differently," Blume said. "NLP software allows you to do that on a much more sophisticated level, looking not just at the number of questions, but at the way in which the questions are put together."

Under the assumption that Black and female jurors will be more sympathetic to a defendant—especially a Black one—prosecutors will sometimes press them to reveal disqualifying information. A common tactic in capital cases is to provide an especially gruesome description of the execution process and then ask if the person would be willing to sentence the defendant to death. If the answer is no, that person is struck from the jury pool.

To see if NLP software could identify this and other signs of disparate questioning, Blume collaborated with Martin Wells, the Charles A. Alexander Professor of Statistical Sciences in the Cornell Ann S. Bowers College of Computing and Information Science, and Effenberger to analyze transcripts from 17 capital cases in South Carolina. Their dataset included more than 26,000 questions that judges, defense attorneys and the prosecution asked potential jurors.

The researchers looked not only at the number of questions asked of Black, white, male and female potential jurors, but also the topics covered, each question's complexity and the parts of speech used.
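As a rough illustration of how one such measure can be quantified, the sketch below computes average question length by juror group on a tiny, entirely hypothetical set of questions. This is not the study's code or data, and the paper's actual analysis uses far richer NLP features than word counts.

```python
from statistics import mean

# Hypothetical mini-dataset of voir dire questions:
# (questioner, juror group, question text). Illustrative only.
questions = [
    ("prosecution", "black", "Could you describe how you would feel about imposing the electric chair?"),
    ("prosecution", "black", "If the evidence were gruesome, would you still vote for death?"),
    ("prosecution", "white", "Do you support the death penalty?"),
    ("prosecution", "white", "Can you be fair to both sides?"),
]

def avg_words(rows, questioner, group):
    """Mean word count of the questions a party asked a juror group."""
    counts = [len(text.split()) for who, grp, text in rows
              if who == questioner and grp == group]
    return mean(counts) if counts else 0.0

print(avg_words(questions, "prosecution", "black"))  # longer questions
print(avg_words(questions, "prosecution", "white"))
```

A real analysis would compare many such measures (complexity, sentiment, parts of speech) across thousands of questions and test whether the differences are statistically significant.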

"We consistently found racial differences in a number of these measures," Wells said. "When we do job interviews, we usually have a list of questions, and we want to ask everyone the same question, and here that's not the case."

The analysis showed significant differences in the length, complexity and sentiment of the questions prosecutors asked of Black potential jurors compared to white ones, indicating they were likely attempting to shape their responses. The questions asked by judges and the defense showed no such racial differences.

The study also found evidence that prosecutors had attempted to disqualify Black individuals by using their views on the death penalty. Prosecutors asked Black potential jurors—especially those who were ultimately excused from serving—more explicit and graphic questions about execution methods compared to white potential jurors.

In six of the 17 cases analyzed in the study, a judge had later ruled that the prosecutor illegally removed potential jurors on the basis of race. By looking at the combined NLP analyses for each case, the researchers could successfully distinguish between cases that violated Batson v. Kentucky and ones that did not.

The researchers said the findings prove that NLP tools can successfully identify biased jury selection. Now, they hope to see similar studies performed on larger datasets with more diverse types of cases.

Once the validity of this method is established, "this could be done during jury selection almost in real time," Wells said.

Whether used to monitor jury selection or to provide evidence for an appeal, this software could be a powerful tool to diversify juries—especially for defendants who are potentially facing the death penalty.

More information: Anna Effenberger et al, Quantifying disparate questioning of Black and White jurors in capital jury selection, Journal of Empirical Legal Studies (2023). DOI: 10.1111/jels.12357

Citation: Analysis of court transcripts reveals biased jury selection (2023, July 28) retrieved 30 July 2023 from https://phys.org/news/2023-07-analysis-court-transcripts-reveals-biased.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.


Natural Language Processing (NLP): Global Market Analysis And Insights

ReportLinker

Report Scope: In this report, the global NLP market has been segmented based on organization size, component, provider, type, industry and geography. The report provides an overview of the global natural language processing market and analyzes market trends.

New York, July 11, 2023 (GLOBE NEWSWIRE) -- Reportlinker.com announces the release of the report "Natural Language Processing (NLP): Global Market Analysis and Insights" - https://www.reportlinker.com/p06474365/?utm_source=GNW

Using 2022 as the base year, this report provides estimated market data for the forecast period from 2023 to 2028. Revenue forecasts for this period are segmented based on component, deployment, organization size, type, application, industry vertical and geography.

Market values have been estimated based on the total revenue of NLP solution providers.

The report covers the NLP market with regard to the user base across different regions. It also highlights major trends and challenges that affect the market and the vendor landscape.

The report estimates the global market for NLP in 2022 and provides projections for the expected market size through 2028. The report explains the value chain and current trends in the global markets for NLP.

It concludes with detailed profiles of the major market players.

Report Includes:
  • 52 tables and 24 additional tables
  • An overview and up-to-date analysis of the global market for natural language processing (NLP) technologies
  • Analyses of global market trends, with historic market revenue data (sales figures) for 2022, estimates for 2023, forecasts for 2024, and projections of compound annual growth rates (CAGRs) through 2028
  • Highlights of emerging technology trends, opportunities and gaps, estimating the current size and anticipated growth of the overall NLP market and its related market segments, and identification of the major regions and countries involved in market developments
  • Estimation of the actual market size and revenue forecast for the global NLP market in USD million values, and corresponding market share analysis based on component, deployment type, organization size, technology type, application, industry vertical and region
  • Updated information on key market drivers and opportunities, industry shifts and regulations, and other region- and industry-specific macroeconomic variables that will influence this market in the coming years (2023-2028)
  • Review of patents in the NLP market, along with global NLP-based patent applications and patents granted over a brief period
  • A look at the major growth strategies adopted by leading players operating in the market for NLP technologies, recent developments, strategic alliances and competitive benchmarking
  • Identification of the major stakeholders and analysis of the competitive landscape based on recent developments and segmental revenues
  • Descriptive company profiles of the leading global players in the industry, including 3M, Apple Inc., IBM Corp., Meta Platforms Inc. and SAS Institute Inc.

Summary: The importance of NLP techniques, a subset of artificial intelligence, is rising as its sub-technologies develop. Language is the main tool for interaction and communication.

Communication is vital for the efficient completion of processes, and language is necessary for communication. NLP is a quickly developing technology due to the vast amounts of Big Data, customized algorithms and powerful tools at its disposal.

NLP can be tackled using various techniques, including rule-based, algorithmic, statistical and machine learning approaches. This is yet another factor contributing to the rising use of NLP across numerous industries.

The use of NLP is expanding the reach of other disciplines. Domains integrating NLP include business, recruitment, finance, healthcare and sports trading.

The global market for NLP was estimated to be valued at $REDACTED billion in 2022. It is projected that the NLP market will grow at a compound annual growth rate (CAGR) of REDACTED%, and it is forecast to reach $REDACTED billion by 2028.

Increasing demand for intelligent virtual assistants such as Siri, Alexa and Google Assistant, a growing volume of unstructured data, advances in machine learning and deep learning, and an expanding range of application areas are some of the key factors driving the growth of the NLP market. Data piracy and security concerns, a lack of skilled professionals, language and cultural barriers, and ethical and bias concerns, however, are hindering market growth.

Apart from drivers and restraints, increasing adoption in healthcare, customer experience enhancement, improved sentiment analysis and social listening, multilingual and cross-cultural applications, and compliance and regulatory applications will create huge opportunities for vendors in the market.

In this report, the NLP market has been segmented based on component, deployment, organization size, type, application, industry vertical and geography. Based on component, the NLP market has been categorized into solution and services.

The solution segment currently dominates the market and was valued at $REDACTED billion in 2022. It is estimated that the NLP market for solutions will grow at a CAGR of REDACTED%, and it is forecast to reach $REDACTED billion by 2028.

Based on deployment, the NLP market has been segmented into cloud and on-premises. Based on organization size, the NLP market has been segmented into large enterprises and SMEs.

Based on type, the NLP market has been segmented into rule-based, statistical and hybrid. Based on application, the NLP market has been segmented into sentiment analysis, social media monitoring, automatic summarization, content management and virtual assistants/chatbots.

Based on industry vertical, the NLP market has been segmented into BFSI, education, healthcare, retail and e-commerce, IT and telecommunication, media and entertainment, and others.

By geography, the NLP market has been segmented into North America, Europe, Asia-Pacific and Rest of World (RoW). North America is currently the most dominant global NLP market.

In 2022, total revenue from the North America NLP market reached $REDACTED billion, which is REDACTED% of the global market. The presence of leading global companies, robust technology infrastructure, a favorable political and economic environment, and high adoption of advanced technologies (e.g., AI, IoT, cloud) are among the key factors driving the North American market. Asia-Pacific is currently the fastest-growing market for NLP. The Asia-Pacific NLP market was valued at $REDACTED billion in 2022. It is projected to grow at a CAGR of REDACTED%, and it is forecast to reach $REDACTED billion by 2028.

Read the full report: https://www.reportlinker.com/p06474365/?utm_source=GNW

About ReportLinker
ReportLinker is an award-winning market research solution. ReportLinker finds and organizes the latest industry data so you get all the market research you need - instantly, in one place.


3 Open Source NLP Tools For Data Extraction

Developers and data scientists use generative AI and large language models (LLMs) to query volumes of documents and unstructured data. Open source LLMs, including Dolly 2.0, EleutherAI Pythia, Meta AI LLaMA, StableLM, and others, are all starting points for experimenting with artificial intelligence that accepts natural language prompts and generates summarized responses.

"Text as a source of knowledge and information is fundamental, yet there aren't any end-to-end solutions that tame the complexity in handling text," says Brian Platz, CEO and co-founder of Fluree. "While most organizations have wrangled structured or semi-structured data into a centralized data platform, unstructured data remains forgotten and underleveraged."

If your organization and team aren't experimenting with natural language processing (NLP) capabilities, you're probably lagging behind competitors in your industry. In the 2023 Expert NLP Survey Report, 77% of organizations said they planned to increase spending on NLP, and 54% said their time-to-production was a top return-on-investment (ROI) metric for successful NLP projects.

Use cases for NLP

If you have a corpus of unstructured data and text, some of the most common business needs include:

  • Entity extraction by identifying names, dates, places, and products
  • Pattern recognition to discover currency and other quantities
  • Categorization into business terms, topics, and taxonomies
  • Sentiment analysis, including positivity, negation, and sarcasm
  • Summarizing the document's key points
  • Machine translation into other languages
  • Dependency graphs that translate text into machine-readable semi-structured representations
    Sometimes, having NLP capabilities bundled into a platform or application is desirable. For example, LLMs support asking questions; AI search engines enable searches and recommendations; and chatbots support interactions. Other times, it's optimal to use NLP tools to extract information and enrich unstructured documents and text.
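As a minimal illustration of the entity-extraction and pattern-recognition needs listed above, the dependency-free sketch below pulls currency amounts and dates out of text with regular expressions. The patterns and sample sentence are illustrative assumptions; production NLP tools handle vastly more variation than this.

```python
import re

# Toy patterns for two common extraction targets. Real extractors cover
# many more formats (negative amounts, other currencies, ISO dates, etc.).
CURRENCY = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")
DATE = re.compile(
    r"\b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\s+\d{1,2},\s+\d{4}\b"
)

def extract(text):
    """Return currency amounts and month-day-year dates found in text."""
    return {"currency": CURRENCY.findall(text), "dates": DATE.findall(text)}

doc = "The contract, signed July 14, 2023, totals $1,250,000.00 with a $500 deposit."
print(extract(doc))  # {'currency': ['$1,250,000.00', '$500'], 'dates': ['July 14, 2023']}
```

Regex-only extraction breaks down quickly on real-world text, which is exactly where the statistical and machine learning approaches in the libraries below earn their keep.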

    Let's look at three popular open source NLP tools that developers and data scientists are using to perform discovery on unstructured documents and develop production-ready NLP processing engines.

    Natural Language Toolkit

    The Natural Language Toolkit (NLTK), released in 2001, is one of the older and more popular NLP Python libraries. NLTK boasts more than 11,800 stars on GitHub and lists over 100 trained models.

    "I think the most important tool for NLP is by far Natural Language Toolkit, which is licensed under Apache 2.0," says Steven Devoe, director of data and analytics at SPR. "In all data science projects, the processing and cleaning of the data to be used by algorithms is a huge proportion of the time and effort, which is particularly true with natural language processing. NLTK accelerates a lot of that work, such as stemming, lemmatization, tagging, removing stop words, and embedding word vectors across multiple written languages to make the text more easily interpreted by the algorithms."

    NLTK's benefits stem from its endurance, with many examples for developers new to NLP, such as this beginner's hands-on guide and this more comprehensive overview. Anyone learning NLP techniques may want to try this library first, as it provides simple ways to experiment with basic techniques such as tokenization, stemming, and chunking. 
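To make those preprocessing steps concrete, here is a toy, pure-Python version of what NLTK's tokenizers, stop-word lists, and Porter stemmer do far more robustly. The stop-word set and suffix rules below are simplified assumptions for illustration only; in practice you would call NLTK's own APIs.

```python
import re

# Crude stand-ins for NLTK's stop-word corpus and PorterStemmer.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "and"}

def tokenize(text):
    """Lowercase and split on non-letter characters (a crude word tokenizer)."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def stem(token):
    """Strip a few common suffixes -- a toy approximation of stemming."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, drop stop words, and stem -- the pipeline NLTK streamlines."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]

print(preprocess("The cats are chasing the mice"))  # ['cat', 'chas', 'mice']
```

Even this toy version shows why real stemmers matter: naive suffix stripping turns "chasing" into "chas" rather than "chase", the kind of edge case NLTK's algorithms handle.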

    spaCy

    spaCy is a newer library, with its version 1.0 released in 2016. SpaCy supports over 72 languages and publishes its performance benchmarks, and it has amassed more than 25,000 stars on GitHub.

    "spaCy is a free, open-source Python library providing advanced capabilities to conduct natural language processing on large volumes of text at high speed," says Nikolay Manchev, head of data science, EMEA, at Domino Data Lab. "With spaCy, a user can build models and production applications that underpin document analysis, chatbot capabilities, and all other forms of text analysis. Today, the spaCy framework is one of Python's most popular natural language libraries for industry use cases such as extracting keywords, entities, and knowledge from text."

    Tutorials for spaCy show similar capabilities to NLTK, including named entity recognition and part-of-speech (POS) tagging. One advantage is that spaCy returns document objects and supports word vectors, which can give developers more flexibility for performing additional post-NLP data processing and text analytics.
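The post-NLP processing mentioned above can be sketched without the library itself: given (token, part-of-speech) pairs like those a spaCy Doc exposes via its token attributes, downstream code can filter for noun keywords. The hand-written tags below stand in for real tagger output and are illustrative only.

```python
# Hypothetical tagger output, mimicking the (token text, POS tag) pairs a
# spaCy Doc provides. A real pipeline would produce these automatically.
tagged = [
    ("spaCy", "PROPN"), ("extracts", "VERB"), ("keywords", "NOUN"),
    ("and", "CCONJ"), ("entities", "NOUN"), ("from", "ADP"), ("text", "NOUN"),
]

def noun_keywords(tokens):
    """Keep only tokens tagged as nouns or proper nouns."""
    return [text for text, pos in tokens if pos in ("NOUN", "PROPN")]

print(noun_keywords(tagged))  # ['spaCy', 'keywords', 'entities', 'text']
```

The value of spaCy's document objects is that this kind of filtering, plus word vectors and entity spans, comes attached to every parsed text without extra bookkeeping.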

    Spark NLP

    If you already use Apache Spark and have its infrastructure configured, then Spark NLP may be one of the faster paths to begin experimenting with natural language processing. Spark NLP has several installation options, including AWS, Azure Databricks, and Docker.

    "Spark NLP is a widely used open-source natural language processing library that enables businesses to extract information and answers from free-text documents with state-of-the-art accuracy," says David Talby, CTO of John Snow Labs. "This enables everything from extracting relevant health information that only exists in clinical notes, to identifying hate speech or fake news on social media, to summarizing legal agreements and financial news."

    Spark NLP's differentiators may be its healthcare, finance, and legal domain language models. These commercial products come with pre-trained models to identify drug names and dosages in healthcare, financial entity recognition such as stock tickers, and legal knowledge graphs of company names and officers.

    Talby says Spark NLP can help organizations minimize the upfront training in developing models. "The free and open source library comes with more than 11,000 pre-trained models plus the ability to reuse, train, tune, and scale them easily," he says.

    Best practices for experimenting with NLP

    Earlier in my career, I had the opportunity to oversee the development of several SaaS products built on NLP capabilities. My first was a SaaS platform for searching newspaper classified advertisements, including cars, jobs, and real estate. I then led the development of NLP capabilities for extracting information from commercial construction documents, including building specifications and blueprints.

    When starting NLP in a new area, I advise the following:

  • Begin with a small but representative sample of the documents or text.
  • Identify the target end-user personas and how extracted information improves their workflows.
  • Specify the required information extractions and target accuracy metrics.
  • Test several approaches and use speed and accuracy metrics to benchmark.
  • Improve accuracy iteratively, especially when increasing the scale and breadth of documents.
  • Expect to deliver data stewardship tools for addressing data quality and handling exceptions.
    You may find that the NLP tools used to discover and experiment with new document types will aid in defining requirements. Then, expand the review of NLP technologies to include open source and commercial options, as building and supporting production-ready NLP data pipelines can get expensive. With LLMs in the news and gaining interest, underinvesting in NLP capabilities is one way to fall behind competitors. Fortunately, you can start with one of the open source tools introduced here and build your NLP data pipeline to fit your budget and requirements.
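The benchmarking advice above can be made concrete with a small evaluation harness that scores each approach's extracted items against a hand-labeled gold set. The entities and approach output here are hypothetical.

```python
def precision_recall(predicted, gold):
    """Set-based precision and recall for extracted items."""
    pred_set, gold_set = set(predicted), set(gold)
    tp = len(pred_set & gold_set)  # true positives: items in both sets
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    return precision, recall

# Hand-labeled gold entities for one test document (hypothetical).
gold_entities = ["Acme Corp", "July 2023", "$1.2M"]
# One candidate approach's output: one false positive, one miss.
approach_a = ["Acme Corp", "$1.2M", "New York"]

p, r = precision_recall(approach_a, gold_entities)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.67 recall=0.67
```

Running the same harness (plus wall-clock timing) across NLTK, spaCy, and Spark NLP pipelines gives the comparable speed and accuracy numbers the iteration loop above depends on.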

    Copyright © 2023 IDG Communications, Inc.





