
Artificial Intelligence Or Machine Learning: What's Right For Your Business?

Cory McNeley is a Managing Director at UHY Consulting.


Artificial intelligence (AI) has transformed the business landscape and changed how we work. Its capability to automate tasks, analyze extensive datasets efficiently and provide concise business insights facilitates both the speed and quality of business operations.

"Artificial intelligence" is often used to describe other technologies, such as machine learning (ML) and deep learning (DL). However, each of these technologies is distinct, and those differences impact which solution is right for your specific challenges. Understanding the high-level differences between each and the challenges that remain with implementation and adoption can help you have more meaningful and direct conversations about the role of these technologies in your organization.

Defining Artificial Intelligence And Machine Learning

AI is centered on programs that replicate common human-like skills. AI can solve problems, perform advanced calculations, and make decisions through the use of statistical models, neural networks and programmed rules. AI is an umbrella term that also includes various subsets of technology like ML and DL.

ML allows programs to identify patterns from data, which is used to enhance the program's performance over time without the need for explicit programming. Common learning models include supervised, unsupervised and reinforcement learning techniques. This subset of AI is especially useful for data-driven decisions with extremely large data sets, such as sales forecasting. DL uses neural networks, a technique inspired by the structure of the human brain, and is commonly found in image recognition and detection systems as well as advanced AI applications such as autonomous vehicles.
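
To make the supervised-learning case concrete, here is a minimal sketch of the sales-forecasting idea using scikit-learn; the monthly figures, features and model choice are invented purely for illustration, not a recommended forecasting setup.

    # Minimal supervised-learning sketch: forecast next month's sales from
    # labeled historical examples. All numbers are hypothetical.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Each row: [month index, marketing spend]; target: units sold that month.
    X = np.array([[1, 10.0], [2, 12.5], [3, 11.0], [4, 15.0], [5, 14.0]])
    y = np.array([120, 150, 135, 180, 170])

    model = LinearRegression().fit(X, y)   # learn the pattern from labeled data

    next_month = np.array([[6, 16.0]])     # features for the month to predict
    print(model.predict(next_month))       # forecasted units sold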

Business Applications Of AI And ML

The world of communications, marketing and customer service is experiencing major disruption as a result of advancements in AI. Commercially available and custom-developed AI tools are helping companies provide high levels of customer service by employing advanced chatbots with more knowledge and flexibility than traditional chatbots. They can dissect and resolve complex inquiries without the need for human intervention. The natural language processing (NLP) aspect of modern AI allows these tools to provide customized marketing and communications that are reactive and continually evolving.

Common applications of ML technology include hyper-segmented customer profiling, predictive maintenance and fraud detection. Each of these draws on supervised learning (labeled, structured data), unsupervised learning (unlabeled, unstructured data) or reinforcement learning, in which prior outputs are evaluated and fed back as inputs to adjust and refine the model's results.

Profiling customers based on previous purchasing habits, location, household income, etc., is nothing new, but combining this data with commuting route data, weather forecasts and social media activity could yield more valuable insight and recommendations.

In predictive maintenance, the mean time to failure by specific machine and physical location in the building—even down to floor orientation—combined with machine models that share common parts, the assigned operator and forecasted demand, helps management address problems proactively and optimize scheduled downtime.

In fraud detection and prevention, customer profiling, institutional data, travel plans and social media help find potential fraud. Previously, major credit card processors used only a few dozen measures to predict fraud. Today, using ML, the number of parameters the card processor considers is far higher, likely reaching into the hundreds.
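
As a rough illustration of scoring many parameters at once, the sketch below trains a random-forest classifier on synthetic transaction features; the feature count, labels and library choice (scikit-learn) are assumptions for the example, not details of any real card processor.

    # Hedged sketch of ML-based fraud scoring on many transaction features.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    n_transactions, n_features = 1000, 200        # hundreds of parameters per transaction
    X = rng.normal(size=(n_transactions, n_features))
    y = rng.integers(0, 2, size=n_transactions)   # 1 = fraudulent, 0 = legitimate (synthetic)

    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    new_tx = rng.normal(size=(1, n_features))     # an incoming transaction to score
    print(clf.predict_proba(new_tx)[0, 1])        # estimated probability of fraud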

Challenges And Learning Curves

There are challenges with implementing any of these technologies. Data quality ranks as the No. 1 issue. As with humans, bad information drives poorly informed decisions from AI. Businesses that plan on implementing any advanced AI tools need to review, catalog and cleanse their data to minimize potential issues with the tool.
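
A hypothetical cleansing pass might look like the short pandas sketch below; the records and column names are invented, and a real review-and-catalog effort would go well beyond this.

    # Illustrative data-cleansing steps before feeding data to an AI tool.
    import pandas as pd

    df = pd.DataFrame({
        "customer_id": [101, 101, None, 103],
        "region": [" north ", " north ", "South", "east"],
        "order_total": ["250.00", "250.00", "abc", "99.50"],
    })

    df = df.drop_duplicates()                              # remove exact duplicate records
    df = df.dropna(subset=["customer_id"])                 # drop rows missing a key identifier
    df["region"] = df["region"].str.strip().str.title()    # normalize categorical text
    df["order_total"] = pd.to_numeric(df["order_total"], errors="coerce")  # bad values become NaN
    print(df)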

Another major issue revolves around acquiring the right talent to work with these tools. According to the Bureau of Labor Statistics, data scientist jobs are projected to grow 36% from 2023 to 2033. With demand for expertise in this field so high, finding skilled, qualified talent to build and deploy these tools will only become more difficult as adoption rises.

Several misconceptions about AI are also prevalent in organizations. While some solutions could be deemed plug-and-play, the vast majority require continual refinement and fine-tuning, and assuming otherwise leads to unrealistic expectations of what AI can and cannot do for your organization. Before you embark on your AI journey, clearly define your goals and objectives. Then, complete a detailed analysis to ensure the tool you are deploying will yield the expected results. Failed implementations could lead to cynical thinking about AI's capabilities.

Conclusion

Whether AI or ML is right for your organization depends on context, and today's tools are advancing rapidly. At the core of the matter is data. These solutions need quality data to operate effectively. Is your organization ready?



Thomson Reuters V. Ross Intelligence: Copyright, Fair Use, And AI (Round One)

Competitor's use of copyrighted material to train a legal research AI tool was not fair use—but questions remain for other AI cases

Earlier this week, a federal judge rejected an AI startup's claim that using copyrighted material to train its AI system was permissible under the fair use doctrine. The decision—Thomson Reuters Enterprise Centre GmbH v. Ross Intelligence Inc., No. 1:20-CV-613-SB (D. Del. Feb. 11, 2025)—marks the first time a court has rejected a fair use defense in this context.

The district court's ruling is surely only the first round in the ongoing legal battles between rights owners, large language models, and the generative AI industry.

Background

Thomson Reuters has a database of nearly every judicial decision from anywhere in the United States. It creates a headnote—a concise statement of a court's decision on a legal issue—for every issue addressed in every decision. The headnotes give lawyers a quick summary of the points of law in a decision. But in addition to creating headnotes, Thomson Reuters also assigns each headnote a Key Number, based on the specific legal issue the headnote (and thus the decision) deals with. Every decision in the database addressing a given legal issue has a headnote with that same Key Number. Lawyers looking at a decision can use the Key Number System to quickly find other decisions addressing the same legal issues.

Around 2015, Ross Intelligence, Inc. (Ross) began building an AI legal search tool that would use natural language processing (NLP) to let users retrieve decisions relevant to their research by presenting questions in plain, conversational language. To make its NLP AI system work, Ross had to train the AI using a "supervised learning" approach—posing a large number of natural-language legal research questions, looking at the decisions the AI returned in response, and telling the AI whether its responses were correct. This process trains a neural-network-based AI so that over time its responses become increasingly reliable.
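
The opinion does not disclose Ross's actual architecture, but the supervised loop it describes can be sketched roughly as follows: label (question, opinion) pairs as relevant or not and fit a classifier on them. The questions, opinions, labels and the scikit-learn stand-in below are all illustrative, not Ross's system.

    # Toy supervised relevance model trained on labeled question/opinion pairs.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    pairs = [
        ("is a verbal contract enforceable", "opinion discussing oral contract enforceability"),
        ("is a verbal contract enforceable", "opinion about maritime salvage rights"),
        ("when does adverse possession apply", "opinion on continuous hostile possession of land"),
        ("when does adverse possession apply", "opinion about corporate merger approval"),
    ]
    labels = [1, 0, 1, 0]   # 1 = the opinion answers the question, 0 = it does not

    texts = [q + " [SEP] " + d for q, d in pairs]
    vec = TfidfVectorizer()
    clf = LogisticRegression().fit(vec.fit_transform(texts), labels)

    query = "can an oral agreement be enforced"
    candidate = "opinion discussing oral contract enforceability"
    # Estimated probability that the candidate opinion is relevant to the query.
    print(clf.predict_proba(vec.transform([query + " [SEP] " + candidate]))[0, 1])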

This process necessarily requires a preexisting understanding of which decisions are relevant to a given natural language question, and Thomson Reuters' headnotes and Key Number System provided exactly what Ross needed. So, Ross initially asked to license this Westlaw data to train its AI, but Thomson Reuters declined. Ross then engaged a third party, LegalEase Solutions, to generate "Bulk Memos" summarizing a wide range of legal issues and identifying relevant cases. Ross used the Bulk Memos to train its AI.

But LegalEase created the Bulk Memos using Westlaw's resources—including both the headnotes and the Key Number System, which the Bulk Memos closely resembled. After Thomson Reuters learned what Ross had done, it sued Ross in federal court, claiming that these uses infringed copyrights in its headnotes and Key Number System. Ross argued that its use of the Bulk Memos was fair use and therefore not an infringement. Both parties moved for summary judgment. The court initially denied those motions, but, after reconsidering, 3rd Circuit Judge Stephanos Bibas, sitting by designation on the district court, invited both parties to submit renewed motions for summary judgment. After considering the renewed motions, the judge changed his mind and granted partial summary judgment in favor of Thomson Reuters, ruling that what Ross had done was not fair use.

The Decision

The court first ruled that by using the Bulk Memos, Ross had infringed Thomson Reuters' copyrights in the headnotes.[1] It then found that Ross's activity was not fair use.

Direct Copying of Copyrighted Works

Only "original" works are copyrightable. 17 U.S.C. § 102(a). Ross argued that the headnotes were not sufficiently original to be protected, so using them could not be infringement. Judge Bibas rejected this claim, concluding that the headnotes—both as individual works and as a compilation—clear the "minimal threshold" of creativity required for copyright protection. The headnotes, he said, "introduce creativity by distilling, synthesizing, or explaining part of an opinion." The judge ruled not only that headnotes embodying content created by Thomson Reuters were subject to copyright, but also that headnotes that merely "quote judicial opinions verbatim" were sufficiently original. He reasoned that even a direct quote from an opinion "is a carefully chosen fraction of the whole," the selection of which "expresses the editor's idea about what the important point of law from the opinion is," regardless of whether or not the underlying judicial opinion is subject to copyright.

The court then addressed whether Ross had copied these original elements in Thomson Reuters' headnotes, finding that of the nearly 3,000 headnotes amenable to resolution on summary judgment, more than 2,200 had been directly infringed. Using a "substantial similarity" analysis, Judge Bibas identified more than 2,200 headnotes with corresponding Bulk Memo questions that "look[ed] more like a headnote than ... the underlying judicial opinion." Next, noting that the substantial similarity inquiry requires evaluating "whether an ordinary user of a product would find [the allegedly infringing work] substantially similar to the copyrighted work," Judge Bibas found that "[a]s a lawyer and a judge," he himself was uniquely "well positioned" to answer that question—which he did, after "having slogged through all 2,830 headnotes," holding that Ross had actually copied 2,243 headnotes "whose language very closely tracks the language of the Bulk Memo question but not the language of the case opinion."

Fair Use

Having found that Ross had copied the headnotes and that the headnotes were subject to copyright, Judge Bibas moved on to Ross's claims that its use of the headnotes was not infringing because it was fair use. He considered in turn each of the four fair use factors. See 17 U.S.C. § 107.

Purpose and Character

Fair use claims often turn on a defendant's argument that the purpose and character of its use of copyrighted material are sufficiently "transformative" to be permissible and non-infringing. Here, Judge Bibas found it difficult to ignore the simple fact that "Ross took the headnotes to make it easier to develop a competing legal research tool." That is, he concluded that Ross was using the headnotes for essentially the same purpose for which Thomson Reuters had created them—facilitating legal research. This conclusion drove much of the court's fair use analysis.

An AI—even a legal research AI—does not, itself, read or understand natural language at all, much less natural language questions about legal opinions. Instead, a neural-network based NLP AI converts natural language into a complicated set of "vectors" that express the mathematical relationships among different words and phrases and uses those mathematical relationships to determine which natural language inputs—here, questions about legal issues—correspond most clearly to which natural language outputs—here, legal opinions in the AI's database.
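
A heavily simplified picture of "questions and opinions as vectors" is sketched below, with TF-IDF vectors and cosine similarity standing in for the learned neural representations the court describes; the opinions are placeholders.

    # Represent opinions and a question as vectors, then rank by similarity.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    opinions = [
        "Opinion holding that an oral contract for goods over $500 is unenforceable.",
        "Opinion discussing the elements of negligence and duty of care.",
        "Opinion on whether a landlord may withhold a security deposit.",
    ]

    vec = TfidfVectorizer()
    opinion_vectors = vec.fit_transform(opinions)      # each opinion becomes a vector

    question = "can my landlord keep my security deposit"
    question_vector = vec.transform([question])

    scores = cosine_similarity(question_vector, opinion_vectors)[0]
    print(opinions[scores.argmax()])                    # the highest-scoring opinion is returned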

Judge Bibas recognized this when he noted that Ross had turned the headnotes "into numerical data about the relationships among legal words to feed into its AI." But while he believed this made the purpose and character factor "much trickier," he ultimately held that Ross's use was not transformative. The key for Judge Bibas was that the ultimate purpose of Ross's use of the headnotes was to create an AI search tool that "retrieves judicial opinions"—which is exactly what Thomson Reuters' headnotes and Key Number System are designed to do. Consequently, Judge Bibas concluded that Ross's use of the headnotes did not "have a 'further purpose or different character' from Thomson Reuters's."

This aspect of the ruling is the key factor that may distinguish it from other pending AI copyright cases: whether using copyrighted material to train a generative AI model—one that creates new material as output, rather than simply retrieving preexisting material—would be a transformative use. The judge himself recognized this distinction, stressing that "only non-generative AI" was at issue.

Finally, on the "purpose and character" factor, Judge Bibas also rejected Ross's argument that its copying of the headnotes occurred only at a permissible "intermediate step"—training queries for its AI. He distinguished the cases supporting that theory on the grounds that they all concerned computer code, not "written words." In his view, intermediate copying in the code cases was "necessary" to ensure compatibility between computer programs.

Nature of the Work

Although Judge Bibas conceded that the perhaps minimally creative nature of Thomson Reuters' headnotes tipped the scales on this factor in Ross's favor, he downplayed it, noting that this factor "rarely play[s] a significant role in the determination of a fair use dispute."

Amount and Substantiality

This factor also favored Ross. Here, Judge Bibas held that it did not matter whether the unpublished Bulk Memo inputs had copied all or substantial parts of Thomson Reuters' headnotes. What mattered was the fact that Ross's public-facing outputs did not include the protected headnotes at all.

Market Effect

Judge Bibas found this to be the most important fair use factor in this case and found that it favored Thomson Reuters. In considering the effects of Ross's copying, Judge Bibas evaluated both the existing market for legal research platforms and a potential derivative market for creating legal-AI training data. Similar to his analysis of the "purpose and character" of Ross's use, Judge Bibas here zeroed in on the fact that Ross "meant to compete with Westlaw by developing a market substitute."

What's Next: NLP Models vs. LLMs

As suggested above, the importance of this case in the ongoing conflicts between AI developers and copyright holders may be limited by differences both in the data required to train NLP models versus LLMs and in the ways these two types of AI are used. NLP-based tools are designed to enhance research by prioritizing the machine's comprehension and analysis of specific human-generated queries rather than generating human-like text. To accomplish this precise information retrieval, NLP systems use supervised learning, requiring carefully curated and labeled training data to ensure precise and contextually relevant results. LLMs, on the other hand, primarily use unsupervised learning based on vast amounts of unlabeled text data, identifying language patterns without the need for the annotated datasets Ross relied on.
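
The contrast in training data can be shown in a few lines; both snippets below use invented placeholder text, with the supervised retrieval examples carrying human labels and the LLM pretraining corpus consisting of raw, unlabeled passages.

    # Supervised NLP retrieval: every example carries a human-provided label.
    supervised_examples = [
        {"question": "is a verbal contract enforceable",
         "opinion": "opinion on oral contract enforceability",
         "relevant": True},
        {"question": "is a verbal contract enforceable",
         "opinion": "opinion about maritime salvage rights",
         "relevant": False},
    ]

    # LLM pretraining: raw, unlabeled text; the model learns to predict the
    # next token from context, with no per-example annotation.
    unlabeled_corpus = [
        "The court held that the oral agreement was unenforceable because ...",
        "Under the doctrine of adverse possession, a claimant must show ...",
    ]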

A Landmark Moment for Fair Use?

Thomson Reuters represents the first significant court opinion on whether the fair use doctrine protects the use of copyrighted materials to train an AI. The ongoing surge in generative AI applications has presented federal courts with numerous copyright disputes involving computer code, music, literature, images, and artwork. Stakeholders across the AI ecosystem are keen to extract relevant insights from these rulings.

However, architects, developers, and others involved with generative AI may have different fair use defenses than Ross's. Judge Bibas specifically confined his analysis to non-generative AI scenarios. Unlike Ross, AI developers using copyrighted material to train generative AI models—which produce original (though, at present, not copyrightable) content—may argue that their use is transformative. It remains to be seen how courts will respond to those arguments.

Even so, Thomson Reuters does highlight the continued need for AI developers to exercise caution when deciding what to use for training data. The financial burden of this lawsuit forced Ross to cease operations. AI developers should use uncopyrighted data when possible, obtain licenses if available, and, when obtaining training data from a third party, always seek indemnification against claims that using the material to train an AI would violate IP rights. In Thomson Reuters, even though Ross asserted that it had no control over LegalEase and had no knowledge of its use of Westlaw's copyrighted headnotes, the court found Ross to have infringed Thomson Reuters' copyrights. This, too, suggests that developers should seek audit rights and additional warranties around the provenance and quality of training data that they acquire from others.

Finally, this ruling may change the depth of due diligence necessary for seed investors and venture capitalists as they consider financing AI startups. This would mark a shift away from founder-centric diligence and could potentially limit access to capital for boundary-pushing startups.

Implications for Copyright Holders

For these same reasons, copyright holders should not be overly confident as a result of this ruling. The type of copyrighted work involved in Thomson Reuters occupies a unique position between largely functional computer code and more expressive works. Also interesting is the fact that Judge Bibas noted that he and other members of the judiciary are regular users of Westlaw's technology. In cases that do not involve legal research tools, judges may be less familiar with the technology and, therefore, may be more inclined to send fair use questions to a jury rather than deciding them on summary judgment.

[1] The court did not resolve the infringement claims regarding Ross's use of the Key Number System or regarding approximately five hundred judicial opinions that apparently contained or reflected Thomson Reuters' own editorial decisions.



What Is Deep Learning?

  • What is deep learning?
  • Types of deep learning models
  • Deep learning applications
  • Difference between machine learning, deep learning, and generative AI
  • Benefits of Deep Learning Models
  • Challenges of Deep Learning Models
    What is deep learning?

    Deep learning is a type of machine learning that uses deep neural networks with multiple layers—often hundreds or thousands—to process data and make decisions like the human brain. Unlike traditional machine learning, which relies on simpler networks and structured data, deep learning can analyse raw, unstructured data using unsupervised learning. These models continuously refine their outputs for greater accuracy.
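
    As a minimal sketch of what "multiple layers" means in practice, the PyTorch snippet below stacks a few fully connected layers; the layer sizes and ten-class output are arbitrary choices for illustration only.

        # A small "deep" network: several stacked layers in PyTorch.
        import torch
        import torch.nn as nn

        model = nn.Sequential(
            nn.Linear(64, 128), nn.ReLU(),   # layer 1
            nn.Linear(128, 128), nn.ReLU(),  # layer 2
            nn.Linear(128, 128), nn.ReLU(),  # layer 3
            nn.Linear(128, 10),              # output layer: 10 classes
        )

        x = torch.randn(32, 64)              # a batch of 32 raw feature vectors
        print(model(x).shape)                # torch.Size([32, 10])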

    Deep learning powers many AI-driven applications, including digital assistants, fraud detection, self-driving cars, and generative AI, making automation and advanced analytics more efficient.

    Types of deep learning models

    Deep learning models come in various types, each designed for specific tasks:

    Convolutional Neural Networks (CNNs): Specialise in image recognition, object detection, and speech/audio processing by identifying patterns in visual data.

    Recurrent Neural Networks (RNNs): Handle sequential data like speech and text, making them ideal for natural language processing (NLP) and time-series predictions.

    Autoencoders & Variational Autoencoders (VAEs): Compress and reconstruct data, enabling tasks like anomaly detection and generative AI.

    Generative Adversarial Networks (GANs): Generate realistic images, videos, and synthetic data by training a generator and a discriminator in competition.

    Diffusion Models: Produce high-quality images by gradually refining noise, offering stable training and precise control over outputs.

    Transformer Models: Revolutionise language processing with parallel computation, powering tasks like machine translation, text summarisation, and AI-generated content.

    Each model has its strengths and trade-offs, but collectively, they drive innovation in Artificial Intelligence (AI), automation, and deep learning applications.
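
    For instance, a convolutional network of the kind described above can be sketched in a few lines of PyTorch; the input size, channel counts and ten output classes are illustrative assumptions.

        # Minimal CNN sketch: convolutions pick out local image patterns,
        # a final linear layer maps them to class scores.
        import torch
        import torch.nn as nn

        cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                             # 16x16 -> 8x8
            nn.Flatten(),
            nn.Linear(32 * 8 * 8, 10),                   # 10 image classes
        )

        images = torch.randn(8, 3, 32, 32)               # batch of 8 RGB 32x32 images
        print(cnn(images).shape)                         # torch.Size([8, 10])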

    Deep learning applications

    Deep learning can be used in a wide variety of applications, including:

  • Image recognition: To identify objects and features in images, such as people, animals, places, etc.
  • Natural language processing: To help understand the meaning of text, such as in customer service chatbots and spam filters.
  • Finance: To help analyse financial data and make predictions about market trends
  • Text to image: Generate images from written descriptions, as in modern text-to-image generation tools.

    Difference between machine learning, deep learning, and generative AI

    The terms machine learning, deep learning, and generative AI indicate a progression in neural network technology.

    Machine Learning (ML)
  • Requires significant human effort to train models.
  • Uses labeled data (supervised learning) to improve accuracy.
  • Struggles with unstructured data like text and images.

    Deep Learning (DL) – A Subset of ML
  • Processes unstructured data efficiently without manual feature extraction.
  • Discovers hidden patterns beyond its training data.
  • Learns over time without needing large labeled datasets (unsupervised learning).
  • Handles volatile data, like financial transactions, for fraud detection.

    Generative AI (GenAI) – An Advancement of DL
  • Moves beyond pattern recognition to create new content.
  • Uses transformer-based neural networks to generate unique outputs.
  • Converts and reinterprets text, images, and data into meaningful new patterns.

    Generative AI represents the next level of deep learning, enabling creativity and content generation rather than just prediction and analysis.

    Benefits of Deep Learning Models

    There are several benefits to using deep learning models, including:

  • Learn complex relationships in data, making them more powerful than traditional ML.
  • Scale effectively by training on large datasets for more accurate predictions.
  • Require less human intervention, learning continuously from real-time data sources like sensors or social media.

    Challenges of Deep Learning Models

    Deep learning also has a number of challenges, including:

  • Need large datasets, making them hard to apply in data-scarce situations.
  • Risk of overfitting, where models learn noise instead of meaningful patterns (illustrated in the sketch after this list).
  • Can inherit biases from training data, leading to unfair or inaccurate results.
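
    The overfitting risk noted above is commonly checked by holding out validation data; the sketch below, using scikit-learn on synthetic data, flags a large gap between training and validation accuracy as the warning sign.

        # Detecting overfitting: compare training vs. validation accuracy.
        # The labels are random noise, so a flexible model can only memorise.
        import numpy as np
        from sklearn.model_selection import train_test_split
        from sklearn.tree import DecisionTreeClassifier

        rng = np.random.default_rng(0)
        X = rng.normal(size=(300, 20))
        y = rng.integers(0, 2, size=300)    # random labels: nothing real to learn

        X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

        model = DecisionTreeClassifier().fit(X_train, y_train)  # unconstrained tree overfits
        train_acc = model.score(X_train, y_train)               # close to 1.0
        val_acc = model.score(X_val, y_val)                     # near 0.5 (chance level)
        print(f"train={train_acc:.2f}  validation={val_acc:.2f}")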




