What is Artificial Intelligence? A Step-by-Step Beginners Guide
Testing Language Models With The Philosophy Of Wittgenstein
Nowadays everybody is talking about the new large language models (LLMs), such as GPT. I feel like it's time to talk about a point of view that is too often forgotten while testing them. More than ever, we are confronted with models in various contexts, and it is our job to ensure their reliability, robustness, and unbiasedness.
Too often we rely on some technical method that some expert decided is the best fit for modern AI. What matters now is to question language itself, which forms the basis of LLMs after all. Since I have a bit of a linguistics background, in this article, I'm going to introduce you to some basic ideas demonstrating that some of the best insights might actually not be technical in nature.
Entering the realm of language and technology: who would be more suitable than the analytic philosopher Ludwig Wittgenstein?
Wittgenstein And AI
As one of the most influential thinkers in analytic philosophy, Wittgenstein (1889-1951) contributed a rather engineer-y perspective on language. This comes as no surprise, since he originally trained as an engineer.
"Perhaps, Wittgenstein never became a philosopher but was always a scientist and engineer."
- Nordmann [3]
Especially in the realm of NLP, there has been a growing interest in his work over the last decade, since his ideas help us to formulate the expectations we might or even should have for language-processing AI.
"The solution to any problem in AI may be found in the writings of Wittgenstein, though the details of the implementation are sometimes rather sketchy."
- Duck-Lewis [1]
Expectations might be the central theme here, since QA (including testing) is all about meeting expectations of certain stakeholders. Our expectations of the model should of course match its purpose. Simply put, matching our intention with expectations is what this article is all about.
But this time we will try doing it with the help of language philosophy and see what it means for testing. This might sound harder than it actually is, so let me try breaking down two of Wittgenstein's main ideas and how they relate to testing.
Reviewing Data Sets Through A Lens Of 'Early' Wittgenstein
The Limits Of Reality
Wittgenstein's earlier works, like Tractatus Logico-Philosophicus, can give us valuable insights when we are testing the pre-processing of language data. In that work, he highlights the gap in meaning between a word and the object it points to in the real world. This is easy to imagine for terms like "bird", "dog", and "cat": we are just using an arbitrary symbol to point towards some sensory experience.
The sentence "Ludwig is happy," however, is more complex than a single word. Suddenly, multiple terms point to each other. What do we do here? One way you can understand this sentence is by its decomposition: the symbol "Ludwig" represents a specific person and "happy" a state or property. In logical notation, imagine the expression as
x R y
Now, "R" stands for the relationship of possessing a property while the substitutions "x=Ludwig" and "y=happy" render the original sentence.
This slight detour illustrates Wittgenstein's belief that we think about our reality in terms of such compositions.
"The limits of my language mean the limits of my world." - Wittgenstein [5]
Applying Decomposition To Language Pre-Processing
Common ways for your fellow NLP engineers to restructure ideas expressed as textual data with the intent of preserving their meaning are: tokenization; lemmatization; and removal of stopwords.
If you're not familiar with these three terms yet: in practice, they ensure consistent formatting of the data.
1. Tokenization: breaking the text into individual words or phrases, such as "Ludwig is happy" to "Ludwig", "is", and "happy".
2. Lemmatization: reducing words to their base forms, such as "eating" to "eat" and "healthier" to "healthy".
3. Stopword removal: removing words that are irrelevant in the given context, for example, reducing the sentence "The dog barks at the mailman" to the words "dog", "barks", and "mailman".
In cases 1 and 2 above, the pre-processing techniques reduced the meaning of each word to its core information so that the relationship to other words becomes more visible. In case 3, the composition of the remaining words still draws a similar picture.
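The three steps above can be sketched in a few lines of Python. This is only a minimal illustration: the stopword set and the lemma table are hand-written assumptions for these example sentences, whereas a real pipeline would rely on a full lemmatizer and a curated stopword list.

```python
import re

# Hypothetical, hand-written resources for illustration only.
STOPWORDS = {"the", "at", "a", "an"}
LEMMAS = {"eating": "eat", "barks": "bark"}


def tokenize(text):
    """Split a text into word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z']+", text)


def lemmatize(tokens):
    """Map each token to its base form if the toy table knows it."""
    return [LEMMAS.get(token.lower(), token) for token in tokens]


def remove_stopwords(tokens):
    """Drop tokens that appear in the toy stopword set."""
    return [token for token in tokens if token.lower() not in STOPWORDS]


print(tokenize("Ludwig is happy"))
print(lemmatize(["eating"]))
print(remove_stopwords(tokenize("The dog barks at the mailman")))
```

Running each step separately like this makes it easier to see which step changed a given word, which is the information a reviewer needs.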
Meeting Expectations
After breaking down the pre-processing part, you can probably already imagine where this is going. What does the relationship between the symbols and the objects they point to tell us about our expectations? In application, a data set consisting of "bird", "dog", "cat", and "Ludwig is happy" would be such a set of symbols.
Reviewing Text In Data Pipelines For Fidelity To Original Meaning
With this in mind, let me show you two clear expectations one could have for the processed data set, if the intent is to leave the meaning unchanged.
Of course, if I already checked that the expectation in case 2 has not been met, you could argue it is unnecessary even to check case 1. Now, where could pre-processing go wrong, using more concrete examples?
So we have to be careful when certain processing steps change the object referred to, or the original meaning. If such ambiguity is undesired, we usually have to review the pre-processing steps used by the developers and check whether their output matches the intended meaning.
For example, the Wikipedia comments section has long been known to be a breeding ground for hostility against women. An LLM that is trained on such a data set might very well pick up this hostility and present a false image of women.
Another example is when we expect balanced weighting of input from people in different demographic groups. Imagine a chatbot trained on blog posts and comments that contain many more texts from 20-year-olds than from any other age group. Depending on the purpose of the model, this will distort your output if you expect it to use only "age neutral" expressions.
Making testing a little more dynamic, you could schedule regular statistical tests in this area, whether they be manual or automated. To reduce age bias, for example, the developer could be warned every time the processed data contains too much offensive language or slang, or other potential signals of the writer's age.
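A scheduled check of this kind could look roughly like the sketch below. It assumes that each text already carries an age-group label as metadata, which is not always the case, and the 50% threshold and the sample labels are arbitrary placeholders.

```python
from collections import Counter


def group_shares(labels):
    """Return the fraction of texts contributed by each group."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def overrepresented(labels, max_share=0.5):
    """List the groups whose share of the data exceeds max_share."""
    shares = group_shares(labels)
    return sorted(group for group, share in shares.items() if share > max_share)


# Hypothetical age-group labels for ten processed texts.
sample_labels = ["20-29"] * 7 + ["30-39"] * 2 + ["40-49"]

for group in overrepresented(sample_labels):
    print(f"Warning: group {group} dominates the processed data")
```

What counts as "too much" depends on the purpose of the model, so the threshold is something to agree on with the stakeholders rather than a fixed value.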
We can find a more pragmatic approach to language and meaning in Wittgenstein's later works, such as Philosophical Investigations. In the last section, you saw the consequences of locating meaning somewhere between the word and the object it refers to. In his later work, Wittgenstein posited that the meaning of a word lies simply in the social context in which it is used. This replaces pointing to some object, which might just as well be your own arbitrary idea of it.
The language is meant to serve for communication between a builder A and an assistant B. A is building with building-stones: there are blocks, pillars, slabs and beams. B has to pass the stones, and that in the order in which A needs them. For this purpose they use a language consisting of the words "block", "pillar", "slab", "beam". A calls them out;—B brings the stone which he has learnt to bring at such-and-such a call.
- Wittgenstein [4]
This concept is perhaps the most significant of Wittgenstein's with regard to language processing. The assistant learns to bring a pillar when the builder shouts "pillar" even though the assistant might still have no idea what a pillar actually is. He learns that in the context of this social interaction, "pillar" is some speech sound that signals him to act in a certain way. So "pillar" is not referring to any concrete object out there. It is just a social contract that helps us to communicate.
Modelling Meaning As A Public Event
What does it mean now for AI? I want to avoid going into the technical depths of machine learning models (particularly neural networks) here. Instead, we are going to link the previous example directly to the observable behaviour of such models.
The model itself doesn't actually understand the input. It just learnt how to act on it by giving the corresponding output. Like our assistant in Wittgenstein's example, the meaning of an input for the model is reacting with the most probable output. The construction and maintenance of meaning becomes a public event with no personal and private idea we might have about it.
Word embeddings are a good example of this. Imagine having the words "queen", "king", "apple", and "banana". How could you capture their meaning in a data set? Both "The king is ruling over the country" and "The queen is ruling over the country" are possible sentences that mean something to most people. However, "The apple is ruling over the country" is a sentence we would not expect to encounter.
In the examples above, we didn't even define the words "queen" and "king". There was no need. Nevertheless, they share some meaning since they fit into the same sentence, while "apple" does not. Word embeddings capture such similarities of meaning as closeness in a vector space. If we simply measure the sentences a word occurs in, instead of thinking about the word itself, the construction of meaning becomes a public event as in Wittgenstein's example.
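To make the idea of measuring the sentences a word occurs in more tangible, here is a minimal sketch that counts co-occurring words and compares the resulting vectors. The four-sentence corpus is invented for illustration (the "lying on the table" sentences are an assumption), and real word embeddings are learned from far larger corpora with more refined methods.

```python
from collections import Counter
from math import sqrt

# Toy corpus; the last two sentences are made up for this example.
SENTENCES = [
    "the king is ruling over the country",
    "the queen is ruling over the country",
    "the apple is lying on the table",
    "the banana is lying on the table",
]


def context_vectors(sentences):
    """Map each word to counts of the other words in its sentences."""
    vectors = {}
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            context = words[:i] + words[i + 1:]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


vectors = context_vectors(SENTENCES)
print(cosine(vectors["king"], vectors["queen"]))
print(cosine(vectors["king"], vectors["apple"]))
```

In this toy corpus "king" and "queen" occur in identical contexts, so their similarity is higher than that of "king" and "apple", even though neither word was ever defined.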
Reviewing The Data, With Examples
Once you are familiar with these concepts, the perspective you need for reviewing the quality of the data output changes. Since the context is what matters now, you can double-check examples in the data set. In the example above, is "queen" closer to "king" than to "apple"? If not, how is the model defining the context, such that its output is not as expected? Or is it indeed just the data set showing a distorted image of "queen"? For the latter, you can also return to the last section, checking with 'early' Wittgenstein why the meaning of the context itself seems skewed.
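Such a spot check can be written down as a small list of expectations and run against any set of embeddings. The two-dimensional vectors below are hand-picked toy values, not the output of a real model, and the helper name `failed_checks` is hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def failed_checks(embeddings, triples):
    """Return the (word, near, far) triples where word is not closer to near."""
    failures = []
    for word, near, far in triples:
        to_near = cosine(embeddings[word], embeddings[near])
        to_far = cosine(embeddings[word], embeddings[far])
        if to_near <= to_far:
            failures.append((word, near, far))
    return failures


# Hand-picked toy vectors standing in for a model's embeddings.
TOY_EMBEDDINGS = {
    "queen": (0.9, 0.1),
    "king": (0.8, 0.2),
    "apple": (0.1, 0.9),
    "banana": (0.2, 0.8),
}
CHECKS = [("queen", "king", "apple"), ("apple", "banana", "king")]

print(failed_checks(TOY_EMBEDDINGS, CHECKS))
```

An empty list means every expectation was met. Any triple that comes back is a starting point for asking whether the model or the data set is responsible.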
At this point, you might think "Well, that seems a little impractical if I have to review 20,000 different words in such a way". Of course, if you're working with bigger data sets, you can at least focus on a subset. A good way to start is by thinking of abstract categories, such as words related to time or belief. In Wittgenstein's example, these categories also get rather complicated. Believing in some value is not something I can concretely point to in the physical world, so it is easily misunderstood, both in a social context and by your model.
Secondly, it is often such abstract words that are very prone to bias and will skew your model's output. Abstract words occur especially in the contexts of ethics and aesthetics. What is good or beautiful is usually just a mere opinion of the person saying it. So the model won't capture any sense of the word. It will probably just reproduce mere opinions in the data set. After all, most of the time, we talk about something being pretty and not about prettiness itself. The model will then regard the same thing as pretty.
Checking The Parameters
How do we define what the context of the word even is? Word embeddings usually look to the word's left and right, but the number of words to consider can vary quite a bit. So, the size of the context is something to consider. Also, does the context have to be a single sentence? Maybe the document where you find the word also matters since usage can differ, for example between prose and poetry.
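The effect of the context size is easy to see in a small sketch. The function below is an illustrative helper, not part of any particular embedding library. It returns the words within a given distance to the left and right of a target word.

```python
def context_window(tokens, index, size):
    """Return up to `size` tokens on each side of tokens[index]."""
    left = tokens[max(0, index - size):index]
    right = tokens[index + 1:index + 1 + size]
    return left + right


tokens = "the queen is ruling over the country".split()
target = tokens.index("queen")

print(context_window(tokens, target, size=1))
print(context_window(tokens, target, size=3))
```

With a window of one, "queen" is only described by "the" and "is". A window of three also picks up "ruling" and "over", which is where the word starts to differ from "apple".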
There is even more meta-information you should consider beyond what you can see on your screen, like sentences or documents. If I write something with sarcasm, then you won't get far by just looking at single words. Or perhaps the expression is accompanied by a performance of action as in Wittgenstein's example. If I say "We are going to the cinema", can the sentence be understood in the same way if we are not actually going right now? To come back to our original hypothesis, the purpose of the model defines the expectations we should have of it. So, it could also be your responsibility as a tester to review if the design and testing parameters fit the intention, and add meta-descriptions accordingly.
Statistical Testing
The last point I'd like to make is about the limits of machine learning models when we are talking about Wittgenstein's philosophy. We define words only through the context in which we use the word. But the problem we face then is that no data set could ever contain every possible context where the word could make sense. In practice, it is impossible to save infinite amounts of information in a database.
At the risk of repeating myself, here again, I'm emphasising the expectations we ought to have depending on your intended use of the model. The contexts we choose to be in the data set should match the purpose of the data set. For example, one has different expectations of a translator than of a grammar checker. While we should expect the former to preserve meaning as much as possible, the latter must focus especially on structural rules of the language you are working with.
Finally, to create appropriate statistical tests, you can always do a web search for the test sets. A lot of them have already been created for nearly every purpose and can often fulfil the role of gold standards. Best of all, many are openly accessible. An interesting test set that I can recommend to check the representation of meaning in the model is called the Google analogy test set. Mixing different test sets will help you tailor the test process depending on the expectations you have on the behaviour of language.
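Analogy test sets of this kind typically consist of four-word questions of the form "a is to b as c is to d". One common way to score them, sketched below, is to compute b - a + c and check whether the nearest remaining word is d. The vectors here are hand-picked toy values and the two questions are illustrative, so this only shows the scoring logic, not a real evaluation.

```python
from math import sqrt

# Hand-picked toy vectors; a real test would load a trained model's embeddings.
TOY_VECTORS = {
    "man": (1.0, 0.0),
    "woman": (1.0, 1.0),
    "king": (3.0, 0.0),
    "queen": (3.0, 1.0),
    "apple": (0.0, 3.0),
}
# Each question reads: a is to b as c is to d.
QUESTIONS = [
    ("man", "woman", "king", "queen"),
    ("king", "queen", "man", "woman"),
]


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def solve(vectors, a, b, c):
    """Return the word closest to b - a + c, excluding the question words."""
    target = tuple(vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c]))
    candidates = [word for word in vectors if word not in (a, b, c)]
    return max(candidates, key=lambda word: cosine(vectors[word], target))


def accuracy(vectors, questions):
    """Fraction of questions where the predicted word equals d."""
    hits = sum(solve(vectors, a, b, c) == d for a, b, c, d in questions)
    return hits / len(questions)


print(accuracy(TOY_VECTORS, QUESTIONS))
```

Reporting accuracy per category of question, rather than one overall number, helps to see which kinds of meaning the model represents poorly.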
Why Should We Even Care About Philosophy?
When we want to test large language models, our expectations of language always influence us. I hope this article was able to demonstrate why it is important to ask questions about the underlying principles. For me personally, Wittgenstein is a good entry point to language philosophy, since he has a very analytical perspective matching IT contexts nicely.
In general, philosophy can also help us to re-discover the wonder and fascination of the everyday things we take for granted. It can remind us to look deeper and to see the world in a new light. Of course, my sketch here might not be the only solution, but changing our perspective on things leads to a more meaningful engagement with software testing.
Sources
[1] Graeme Hirst. 1997. Context as a spurious concept. arXiv preprint cmp-lg/9712003.
[2] Sandra Luksic. 2020. Wittgenstein, natural language processing, and ethics of technology. Master's thesis, Duke University, Department of Philosophy.
[3] Alfred Nordmann. 2002. Another new Wittgenstein: The scientific and engineering background of the Tractatus. Perspectives on Science, 10(3):356–384.
[4] Ludwig Wittgenstein. 2010. Philosophical investigations. John Wiley & Sons.
[5] Ludwig Wittgenstein. 2013. Tractatus logico-philosophicus. Routledge.
For Further Information
Rise of the Guardians: Testing Machine Learning Algorithms 101, Patrick Prill
Technical Risk Analysis For AI Systems, Bill Matthews
Ludwig Wittgenstein, Stanford Encyclopedia of Philosophy
Top 10 Programming Languages For AI And Natural Language Processing
In this article, we'll discuss the top 10 programming languages for AI and Natural Language Processing.
We have seen a recent boom in the fields of Artificial Intelligence (AI) and Natural Language Processing (NLP). Revolutionary tools such as ChatGPT and DALL-E 2 have set new standards for NLP capabilities. These tools are harnessing the power of language processing to store information and provide detailed responses to inputs.
In fact, according to research by Fortune Business Insights, the global market size for Natural Language Processing (NLP) is expected to witness significant growth. The market is projected to expand from $24.10 billion in 2023 to $112.28 billion by 2030, exhibiting a robust compound annual growth rate (CAGR) of 24.6%. This indicates a promising outlook for the NLP market, driven by the increasing demand for advanced language processing solutions across various industries.
With the presence of major industry players, North America is anticipated to dominate the market share of natural language processing. In 2021, the market in North America already accounted for a significant value of USD 7.82 billion, and it is poised to capture a substantial portion of the global market share in the forthcoming years. The region's strong position further reinforces its leadership in driving advancements and adoption of natural language processing technologies.
As the demand for AI and NLP continues to soar, the question arises: which programming languages are best suited for AI development? When it comes to AI programming languages, Python emerges as the go-to choice for both beginners and seasoned developers. Python's simplicity, readability, and extensive libraries make it the perfect tool for building AI applications. In addition, Python allows easy scaling of large machine learning models. Python, along with Lisp, Java, C++, and R, remains among the most popular programming languages in the AI development landscape.
The dominance of Python is further reinforced by the job market, where employers increasingly seek Python language skills. According to TIOBE Programming Community index, Python, SQL, and Java top the list of in-demand programming skills, with Python securing the first spot. With its versatility and ease of use, Python finds applications in various domains, including app and website development, as well as business process automation.
While the utilization of NLP and AI has become imperative for businesses across industries, some companies such as Microsoft Corporation (NASDAQ:MSFT), Amazon.com, Inc. (NASDAQ:AMZN), and Alphabet Inc. (NASDAQ:GOOG) have played a crucial role in driving recent advancements in these technologies.
Notably, Microsoft Corporation (NASDAQ:MSFT)'s significant investment of $10 billion in OpenAI, the startup behind ChatGPT and DALL-E 2, has made waves in the AI and NLP landscape. These tools have not only transformed the technological landscape but have also brought AI and NLP innovations to the general public in exciting new ways.
Also, Microsoft Corporation (NASDAQ:MSFT)'s Azure, as the exclusive cloud provider for ChatGPT, offers a wide range of services related to NLP. These include sentiment analysis, text classification, text summarization, and entailment services.
The significance of AI and NLP is palpable at Amazon.com, Inc. (NASDAQ:AMZN) as well. The widely recognized Alexa device, capable of playing your favorite song or providing product recommendations, exemplifies AI and NLP in action. Additionally, Amazon.com, Inc. (NASDAQ:AMZN)'s Amazon Web Services (AWS) provides cloud storage solutions, enabling businesses to complete their digital transformations.
The impact of AI and the recent surge in generative AI extends beyond Google's homegrown products, as parent company Alphabet Inc. (NASDAQ:GOOG) is actively investing in startups. Alphabet Inc. (NASDAQ:GOOG)'s venture capital arm, CapitalG, recently led a $100 million investment in corporate data firm AlphaSense, valuing the company at $1.8 billion.
So, if you are curious to discover the top programming languages for AI and NLP, keep reading and delve into the realm of these exciting technologies.
Our Methodology:
To rank the top 10 programming languages for deep learning and NLP, we conducted extensive research to identify commonly used languages in these fields, considering factors such as community support, performance, libraries, ease of use, scalability, and industry adoption. We collected relevant data and evaluated each language on these criteria, assigning scores on a scale of 1 to 5. Higher scores were given to languages demonstrating more robust performance and broader usage in AI and NLP development. We sorted the list in ascending order of score, so the highest-rated languages appear last.
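A scoring scheme like this can be sketched as follows. The article does not say how the per-criterion scores were combined, so the plain average is an assumption. The language names and numbers are placeholders and not the figures behind this ranking.

```python
from statistics import mean

CRITERIA = ("community", "performance", "libraries", "ease_of_use", "scalability", "adoption")

# Hypothetical placeholder scores on a 1-5 scale, one value per criterion.
SCORES = {
    "lang_a": (4, 5, 3, 3, 4, 3),
    "lang_b": (5, 3, 5, 5, 4, 5),
    "lang_c": (3, 4, 3, 2, 3, 2),
}


def rank_ascending(scores):
    """Average each language's criterion scores and sort from lowest to highest."""
    averaged = {lang: round(mean(values), 1) for lang, values in scores.items()}
    return sorted(averaged.items(), key=lambda item: item[1])


for position, (lang, score) in enumerate(rank_ascending(SCORES), start=1):
    print(position, lang, score)
```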
Here is the list of the top 10 programming languages for AI and Natural Language Processing.
10. Rust
Performance Level: 3.5
Rust, known for its high performance, speed, and a strong focus on security, has emerged as a preferred language for AI and NLP development. Offering memory safety and avoiding the need for garbage collection, Rust has garnered popularity among developers seeking to create efficient and secure software. With a syntax comparable to C++, Rust provides a powerful and expressive programming environment. Notably, renowned systems including Dropbox, Yelp, Firefox, Azure, Polkadot, Cloudflare, npm, and Discord rely on Rust as their backend programming language. Due to its memory safety, speed, and ease of expression, Rust is considered an ideal choice for developing AI and leveraging it in scientific computing applications.
9. Prolog
Performance Level: 3.7
Prolog is a logic programming language. It is mainly used to develop logic-based artificial intelligence applications. Prolog's declarative nature and emphasis on logic make it particularly well-suited for tasks that involve knowledge representation, reasoning, and rule-based systems. Its ability to efficiently handle symbolic computations and pattern matching sets it apart in the AI and NLP domains. Prolog's built-in backtracking mechanism allows for elegant problem-solving approaches. With Prolog, developers can focus on specifying the problem's logic rather than the algorithmic details. These characteristics make Prolog an appealing choice for AI applications that involve complex inference, knowledge-based systems, and natural language processing tasks.
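To give a feel for this declarative style without leaving Python, the sketch below imitates a tiny Prolog-like knowledge base. It is an approximation rather than Prolog itself: facts are tuples, a rule is a generator, and the nested loops stand in for backtracking. The names in the facts are made up.

```python
# Facts in the spirit of parent(Parent, Child); the names are hypothetical.
PARENT_FACTS = [("ann", "bob"), ("bob", "cid"), ("bob", "dee")]


def parent(p=None, c=None):
    """Yield every (parent, child) fact that matches the bound arguments."""
    for fact_parent, fact_child in PARENT_FACTS:
        if (p is None or p == fact_parent) and (c is None or c == fact_child):
            yield fact_parent, fact_child


def grandparent(g=None, c=None):
    """Rule: grandparent(G, C) holds if parent(G, P) and parent(P, C)."""
    for fact_g, fact_p in parent(p=g):
        # When the inner search is exhausted, control returns to the next
        # outer fact, which mimics backtracking.
        for _, fact_c in parent(p=fact_p, c=c):
            yield fact_g, fact_c


print(list(grandparent(g="ann")))
print(list(grandparent(c="dee")))
```

The same rule answers both "whose grandparent is ann?" and "who is dee's grandparent?", which is the kind of flexibility the paragraph above attributes to specifying logic instead of algorithms.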
8. Wolfram
Performance Level: 3.8
Wolfram programming language is known for its fast and powerful processing capabilities. In the realm of AI and NLP, Wolfram offers extensive capabilities including 6,000 built-in functions for symbolic computation, functional programming, and rule-based programming. It also excels at handling complex mathematical operations and lengthy natural language processing tasks. Moreover, Wolfram seamlessly integrates with arbitrary data and structures, further enhancing its utility in AI and NLP applications. Developers rely on Wolfram for its robust computational abilities and its aptitude for executing sophisticated mathematical operations and language processing functions.
7. Haskell
Performance Level: 4
Haskell prioritizes safety and speed which makes it well-suited for machine learning applications. While Haskell has gained traction in academia for its support of embedded, domain-specific languages crucial to AI research, tech giants like Microsoft Corporation (NASDAQ:MSFT) and Meta Platforms, Inc. (NASDAQ:META) have also utilized Haskell for creating frameworks to manage structured data and combat malware.
Haskell's HLearn library offers algorithmic implementations for machine learning, and deep learning is supported through Haskell bindings for TensorFlow. Haskell shines in projects involving abstract mathematics and probabilistic programming, empowering users to design highly expressive algorithms without compromising efficiency. Haskell's versatility and fault-tolerant capabilities make it a secure programming language for AI applications, ensuring robustness in the face of failures.
6. Lisp
Performance Level: 4.3
Lisp, one of the pioneering programming languages for AI, has a long-standing history and remains relevant today. Developed in 1958, Lisp derived its name from 'List Processing,' reflecting its initial application. By 1962, Lisp had evolved to address artificial intelligence challenges, solidifying its position in the field. While Lisp is still capable of producing high-quality software, its complex syntax and costly libraries have made it less favored among developers. However, Lisp remains valuable for specific AI projects, including rapid prototyping, dynamic object creation, and the ability to execute data structures as programs.
Disclosure: None. Top 10 Programming Languages for AI and Natural Language Processing is originally published on Insider Monkey.
Natural Language Processing Used To Extract Social Determinants Of Health
Information on the nonmedical factors that influence health outcomes, known as social determinants of health, is often collected at medical appointments. But this information is frequently recorded as text within the clinical notes written by physicians, nurses, social workers, and therapists.
Researchers from Regenstrief Institute and Indiana University Fairbanks School of Public Health recently published one of the first studies in which natural language processing was applied to social determinants of health. The researchers developed three new natural language processing algorithms that successfully extract information on housing challenges, financial stability and employment status from text data in electronic health records.
"Health and well-being are not just about medical care. Mostly, they are about our behaviors, our environment, our social connections," said Regenstrief Institute Research Scientist and Fairbanks School of Public Health faculty member Joshua Vest, PhD, who led the study. "More and more healthcare organizations are having to deal with social determinants because it is factors like financial resources, housing, and employment status that really drive costs that make people unhealthy. The challenge for health care organizations is effectively measuring and identifying patients with social risks so that they can intervene."
"Our work helps advance the field in both application and methodology. Natural language processing has been applied to numerous conditions in the past, but this is one of the first papers that applies it to social determinants of health. We demonstrated that a relatively simplistic natural language processing approach could effectively measure social determinants instead of using more sophisticated deep learning and neural network models. These latter models are powerful but complex, difficult to implement, and require a lot of expertise, which many health systems don't have."
"We purposely designed a system that could run in the background, read all the notes and create tags or indicators that say this patient's record contains data suggesting possible concern about a social indicator related to health. Our overall goal is to measure social determinants well enough for researchers to develop risk models and for clinicians and healthcare systems to be able to use these factors – housing challenges, financial security and employment status – in routine practice to help individuals and to provide a better understanding of the overall characteristics and needs of their patient population."
Joshua Vest, PhD, Regenstrief Institute Research Scientist and Fairbanks School of Public Health faculty member
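As a rough illustration of how a simple rule-based system can read notes and create tags, here is a minimal sketch of a keyword tagger with a small negation state. It is not the algorithm from the study: the cue words, the negation words and the three-token negation span are all assumptions made up for this example.

```python
import re

# Hypothetical cue words mapped to the three factors named in the article.
CUES = {
    "homeless": "housing",
    "eviction": "housing",
    "unemployed": "employment",
    "debt": "financial",
    "bills": "financial",
}
NEGATIONS = {"no", "not", "denies", "without"}
NEGATION_SPAN = 3  # Assumed number of tokens a negation stays active for.


def tag_note(note):
    """Return the set of social-factor tags suggested by a clinical note."""
    tags = set()
    negated_for = 0  # State: a value above zero means we are inside a negated span.
    for token in re.findall(r"[a-z]+", note.lower()):
        if token in NEGATIONS:
            negated_for = NEGATION_SPAN
            continue
        if token in CUES and negated_for == 0:
            tags.add(CUES[token])
        negated_for = max(0, negated_for - 1)
    return tags


print(tag_note("Patient is unemployed and worried about medical bills."))
print(tag_note("Patient denies eviction; not homeless."))
```

The sketch ignores sentence boundaries and spelling variants, so it should be read as a picture of the general idea rather than something to run on real records.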
Information indicating social needs can be extracted from many types of data in an electronic medical record, including information on patient occupation, health insurance coverage, marital status, size of household, address (low versus high crime area) and frequency of address changes.
Previously, Dr. Vest and colleagues, including Regenstrief Institute Vice President for Data and Analytics Shaun Grannis, M.D., created an app they named Uppstroms, Swedish for upstream, and successfully demonstrated that it could use structured data to predict patients in need of a referral to a social service such as a nutritionist.
Journal reference:
Allen, K. S., et al. (2023) Natural language processing-driven state machines to extract social factors from unstructured clinical documentation. JAMIA Open. doi.org/10.1093/jamiaopen/ooad024.