      COMP5328 - Advanced Machine Learning 
      Bias and Fairness in Large Language Models (LLMs) 
       
      This is a group assignment for 2 to 3 students only; it is NOT an individual assignment. It is worth 25% of your total mark. 
       
      1. Introduction 
      Generative AI models have garnered significant attention and adoption in various domains due to their remarkable output quality. Nevertheless, these models rely on massive, internet-sourced datasets and exhibit vulnerabilities that have sparked debate on important ethical concerns, especially around fairness: the amplification of human biases and a potential decline in trustworthiness. 
       
      This assignment aims to investigate methods for bias mitigation within generative AI models and to propose your own method to mitigate bias in LLMs. While there are two critical areas where fairness is paramount, Text-to-Text and Text-to-Image, our focus in this assignment is specifically on the Text-to-Text problem. 
      ● Text-to-Text using Large Language Models (LLMs): This area encompasses prominent language models such as Llama-2, BERT, T5, GPT-2/3, and ChatGPT, and examines the potential for these models to generate biased textual content and its implications. 
      1.1 Common bias categories 
      To contextualise our investigation, we have identified several common categories of bias that 
      may manifest within generative AI models: 
      ● Gender and Occupations: One significant aspect involves exploring biases related to gender disparities in various professions. By analysing the output of generative models, we can discern whether these models tend to associate specific careers more with one gender than the other, thus potentially perpetuating occupational stereotypes. For example: 
      ○ Text-to-Text: GPT-2 may generate text that reinforces traditional gender stereotypes, for instance associating caregiving with women and leadership with men, perpetuating societal biases. Example: "She is a nurturing mother, always putting her family first." 
      ○ Text-to-Image: the results generated by Stable Diffusion for the prompt "A photo of a firefighter." (figures not reproduced here). 
       
      ● Race / Ethnicity: Another critical dimension involves assessing biases related to race and ethnicity: 
      ○ Text-to-Text: GPT-2 may generate text that perpetuates racial stereotypes or generalisations about specific racial or ethnic groups, for example: "Asian people are naturally good at math."; or it may generate content that oversimplifies or misrepresents the cultures and traditions of certain racial or ethnic groups, for example: "All Latinos are passionate dancers." 
      ○ Text-to-Image: the biased results returned for "intelligent person" by image search engines (figures not reproduced here). 
       
       
      Addressing bias and fairness in generative AI represents a complex and ongoing challenge. 
      Researchers and developers are actively engaged in devising a range of techniques aimed at bias 
      detection and mitigation. These approaches include the diversification of training data sources, the 
      development of ethical guidelines for AI development, and the creation of algorithms designed 
      explicitly to identify and rectify bias within AI-generated outputs. 
      1.2 Safety 
      Generative AI is used in intentionally harmful ways. This includes misusing generative AI to 
      generate child sexual exploitation and abuse material based on images of children, or generating 
      sexual content that appears to show a real adult and then blackmailing them by threatening to 
      distribute it over the internet. Generative AI can also be used to manipulate and abuse people by 
      impersonating human conversation convincingly and responding in a highly personalised manner, 
      often resembling genuine human responses. 
      Note: The Stable Diffusion figures referred to above are presented only to demonstrate the bias; this assignment is concerned only with "text-based bias and fairness" in LLMs. 
       
      2. A Guide to Using the Datasets 
      To effectively investigate and assess bias within generative AI models for Text-to-Text, it is crucial to select appropriate datasets that reflect real-world scenarios and challenges. Depending on your chosen focus, you may need to find specific datasets for your area of investigation, e.g., healthcare, sports, or entertainment datasets. We provide some examples below; however, you are free to choose any dataset not listed. Several datasets are used for LLM bias evaluation [1]; you may refer to this link for more information: https://github.com/i-gallegos/Fair-LLM-Benchmark. 
      Those datasets are to be used for evaluation only; do not train your model with them. 
       
      Depending on your research objectives, select training datasets that align with your area of 
      investigation. 
      ● Access the chosen datasets through official sources, research papers, or relevant 
      repositories. 
      ● Download the training dataset(s) to your local environment. Ensure that you adhere to any licensing or usage terms associated with the dataset(s). Depending on the debiasing techniques employed, retraining the model may be necessary. Commonly utilised training datasets include Common Crawl, Wikipedia, BookCorpus, PubMed, and arXiv for LLMs (and ImageNet, COCO, VQA, Flickr30k, etc. for vision models). 
      ● Pre-process the dataset as necessary for compatibility with your chosen de-biasing (i.e., fairness-enabling) methods in a generative AI model. Consider factors like label imbalance among various demographic groups in the training data, as this can lead to bias. One common method for addressing such imbalance is counterfactual data augmentation (CDA) [1] to balance labels; a minimal CDA sketch is given after this list. Additionally, other pre-processing techniques involve adjusting harmful information in the data or eliminating potentially biased texts. Identify and handle harmful text subsets using different methods to ensure a fairer training corpus. 
      ● Integrate the pre-processed dataset(s) into your code for training and evaluation. Ensure 
      that you have the appropriate data loading and pre-processing routines in place to work 
      seamlessly with generative AI models. 
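
      To make the CDA step above concrete, the following is a minimal sketch of counterfactual data augmentation for gendered terms. The swap dictionary and the example sentences are illustrative placeholders, not a prescribed word list; a real pipeline would use a curated dictionary of term pairs and your chosen training corpus.

```python
import re

# Illustrative (toy) swap dictionary; a real pipeline would use a curated
# list of term pairs from the CDA literature rather than this small set.
GENDER_SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",
    "man": "woman", "woman": "man",
    "father": "mother", "mother": "father",
}

def counterfactual(sentence: str) -> str:
    """Return a copy of the sentence with gendered terms swapped,
    preserving capitalisation and punctuation."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        target = GENDER_SWAPS.get(word.lower())
        if target is None:
            return word
        return target.capitalize() if word[0].isupper() else target
    return re.sub(r"[A-Za-z]+", swap, sentence)

def augment(corpus: list[str]) -> list[str]:
    """Return the original corpus plus one counterfactual copy per sentence."""
    return corpus + [counterfactual(s) for s in corpus]

if __name__ == "__main__":
    corpus = ["He is a doctor.", "She is a nurturing mother."]
    print(augment(corpus))
    # ['He is a doctor.', 'She is a nurturing mother.',
    #  'She is a doctor.', 'He is a nurturing father.']
```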
       
      Remember that data pre-processing and formatting are crucial steps in ensuring that the datasets 
      are ready for input into your generative AI models. Additionally, make sure to document your 
      dataset selection and pre-processing steps thoroughly in your research report for transparency and 
      reproducibility. 
       
      3. Performance Evaluations 
      Most fairness metrics for LLMs can be categorised by what they use from the model such as the 
      embeddings, probabilities, or generated text, including: 
      ● Embedding-based metrics: Using the dense vector representations to measure bias, which 
      are typically contextual sentence embeddings. 
      ● Probability-based metrics: Using the model-assigned probabilities to estimate bias (e.g., to 
      score text pairs or answer multiple-choice questions). 
      ● Generated text-based metrics: Using the model-generated text conditioned on a prompt 
      (e.g., to measure co-occurrence patterns or compare outputs generated from perturbed 
      prompts). 
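
      As a concrete illustration of the probability-based category, the sketch below scores a stereotype/anti-stereotype sentence pair with GPT-2 and compares their log-likelihoods. The sentence pair is an illustrative assumption; curated pairs are available in benchmarks such as CrowS-Pairs or StereoSet. It assumes the Hugging Face transformers and torch packages are installed.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_log_likelihood(sentence: str) -> float:
    """Total log-likelihood of a sentence under the causal LM."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])
    # outputs.loss is the mean negative log-likelihood over the predicted
    # tokens (sequence length minus one), so scale it back to a total.
    n_predicted = inputs["input_ids"].shape[1] - 1
    return -outputs.loss.item() * n_predicted

# Illustrative pair differing only in the pronoun (the demographic term).
stereotype = "The nurse said that she would be late."
anti_stereotype = "The nurse said that he would be late."

gap = sentence_log_likelihood(stereotype) - sentence_log_likelihood(anti_stereotype)
print(f"Log-likelihood gap (stereotype - anti-stereotype): {gap:.3f}")
# A gap that is consistently positive over many such pairs suggests the
# model systematically prefers the stereotypical completion.
```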
       
       
       4. Tasks 
      Your main tasks are: 
       
      ● Research: Conduct in-depth research to identify various methods for addressing bias in generative AI. Ensure you understand the theoretical foundations and practical implementation of these methods. Provide a comprehensive comparison of the various methods based on the conducted evaluations and discuss their contributions, evaluation methods, strengths, and weaknesses (this will help with the Related Work section of the report). 
       
      ● Proposed Mathematical Model: 
      ○ Choose a language model, such as Llama-2, BERT, T5, GPT-2/3, or ChatGPT, from which you would like to remove bias. Write a mathematical model for your proposed approach: represent the training datasets as a database or feature sets, the preprocessing steps applied to the training datasets, the objective and optimisation method employed, the training of the model using the LLM, and the evaluation metrics used to assess your model (an illustrative objective is sketched at the end of this task list). Provide a comprehensive table showing all notations along with their descriptions. 
      ○ Write algorithms showing all steps of the proposed approach, including system initialisation, training/testing, bias evaluations, results evaluations, or any other steps that show the implementation of your proposed approach. 
      ○ Show a schematic representation of your proposed approach. 
      ● Code Development: 
      ○ Implement the selected bias mitigation methods, based on the proposed mathematical model. 
      ○ Train the model using the selected LLM with the pre-processed dataset (if needed). 
      ○ Evaluate the bias, show experimental evaluations of various metrics, and generate the corresponding figures. 
      ○ The code (including interfacing for training the model using the LLM and results evaluations) must be written in Python 3. You are allowed to use any external libraries for performance comparisons; however, you need to provide details on how the libraries were set up and how the evaluation metrics were used, in the Appendix section. 
       
      ● Evaluation: 
      ○ Run the chosen model on the evaluation datasets before applying any debiasing techniques and show, via various prompts, whether bias exists; these results are termed the baseline. 
      ○ Pre-process the dataset and train the model using the LLM with your proposed method. Evaluate the performance of the trained model via various prompts to demonstrate that you have addressed the bias. Note that some debiasing techniques may not require retraining the model. 
      ○ Compare the performance of the proposed method with the baseline (a minimal sketch of this prompt-based comparison is given at the end of this task list). 
      ○ Evaluate other performance metrics, e.g., utility, training time, average, standard deviation, etc. Note that some of the evaluation metrics might not be applicable in your proposed scenario; hence, you must actively think of various evaluation metrics to determine the applicability of your model. A comprehensive literature survey will help you find how authors evaluated the bias and enabled fairness of generative AI models. 
      ○ Important: Please note that this is our understanding of how to carry out this study and its evaluations, i.e., show the bias of the chosen model via prompts → apply the chosen debiasing technique (for example, pre-process the dataset to remove imbalanced labels and re-train the model with the pre-processed dataset) → show via prompts that you have addressed the bias → compare the baseline with the proposed approach. If you think that this might not work, you need to come up with other techniques. 
       
      ● Conclude: 
      ○ Conclude your findings and show the strengths and weaknesses of your proposed approach. 
      ○ Provide a hypothetical comparison of your approach with other approaches in the literature. This comparison could be based on various performance metrics. 
      ○ Provide future research directions on how to mitigate those weaknesses. 
      ○ Provide comprehensive directions on how your proposed model could be generalised and made applicable to various application scenarios, e.g., social media applications, stock markets, health or sports analytics, etc. 
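
      For the Proposed Mathematical Model task, the snippet below shows one purely illustrative way such an objective could be written. The notation (training corpus D, model parameters θ, counterfactual pairs (x, x̃), and trade-off weight λ) is our assumption for illustration, not a prescribed formulation; your own model should define its notation in the required table.

```latex
% Illustrative only: one possible debiasing objective (not a prescribed model).
% \mathcal{D}: pre-processed training corpus; \theta: LM parameters;
% \mathcal{L}_{\mathrm{LM}}: standard language-modelling loss;
% (x, \tilde{x}): counterfactual pair differing only in a protected attribute;
% \lambda: trade-off weight between utility and fairness.
\begin{equation}
  \min_{\theta}\;
  \underbrace{\mathbb{E}_{x \sim \mathcal{D}}
    \bigl[\mathcal{L}_{\mathrm{LM}}(x;\theta)\bigr]}_{\text{utility}}
  \;+\;
  \lambda\,
  \underbrace{\mathbb{E}_{(x,\tilde{x}) \sim \mathcal{D}_{\mathrm{CDA}}}
    \bigl[\,\bigl|\log p_{\theta}(x) - \log p_{\theta}(\tilde{x})\bigr|\,\bigr]}_{\text{fairness regulariser}}
\end{equation}
```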
       
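      For the Evaluation task, the sketch below illustrates the prompt-based comparison between the baseline and the debiased model using a simple generated-text metric (gendered pronoun counts in sampled completions). The prompt templates, word lists, and the "./debiased-gpt2" checkpoint path are illustrative assumptions, not a prescribed setup; it relies on the Hugging Face transformers text-generation pipeline.

```python
from transformers import pipeline

# Illustrative prompt templates and pronoun lists (assumptions, not prescribed).
PROMPTS = ["The doctor said that", "The nurse said that", "The engineer said that"]
FEMALE_WORDS = {"she", "her", "hers"}
MALE_WORDS = {"he", "him", "his"}

def gender_counts(model_name: str, n_samples: int = 20) -> tuple[int, int]:
    """Count female/male pronoun occurrences in sampled completions."""
    generator = pipeline("text-generation", model=model_name)
    female = male = 0
    for prompt in PROMPTS:
        outputs = generator(prompt, max_new_tokens=20, do_sample=True,
                            num_return_sequences=n_samples)
        for out in outputs:
            tokens = out["generated_text"].lower().split()
            female += sum(t.strip(".,!?") in FEMALE_WORDS for t in tokens)
            male += sum(t.strip(".,!?") in MALE_WORDS for t in tokens)
    return female, male

if __name__ == "__main__":
    baseline = gender_counts("gpt2")            # model before debiasing
    debiased = gender_counts("./debiased-gpt2") # hypothetical fine-tuned checkpoint
    print("baseline (female, male):", baseline)
    print("debiased (female, male):", debiased)
    # Report these counts (or a ratio) per prompt group as part of the
    # baseline-versus-proposed comparison described above.
```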
      Note: The above steps are written in considerable detail. If you still have any ambiguity about these steps, implementation/technical questions, or the understanding of the problem scenario, then please do your own research and share your findings on Ed so that other students can also get an idea of how to deal with specific problem steps. Furthermore, please also post your concerns/questions on Ed under the "Assignment 2" thread; our teaching team will be happy to share their experience and suggestions. Please note that this is an open research assignment; use your own creativity and come up with your own understanding of this problem scenario and solution. 
       
      4.1 Report 
      The report should be organised like a research paper and should contain at least the following sections: 
       
      Abstract: 
      • Clearly introduces the topic scenario and its significance. 
      • Provides a concise summary of the proposed evaluation method. 
      • Provide the results from various evaluation metrics. 
      • Conclude your contributions and discuss their applicability in the real-world scenario. 
       
      Introduction: 
      • Clearly introduces the problem of bias in generative AI and its importance. 
      • Provides a clear and detailed overview of the proposed methods. 
      • Write the contributions in detail, e.g., pre-processing, experimental setup, mathematical model, proposed evaluation method and metrics, and the various steps taken to achieve and evaluate your results. 
      • Provide discussion on the key results and show the organisation of your report at the end 
      of this section. 
       Related Work: 
      • Provides a comprehensive review of related debiasing and fairness methods. 
      • Discusses the advantages and disadvantages of the reviewed methods in the literature. 
      • Demonstrates understanding of the existing literature. 
      • Provide a summary table of the existing works showing their contributions, evaluation methods, strengths, and weaknesses. 
       
      Proposed Method: 
      • Explains the theoretical foundations of the proposed solution effectively. 
      • Describes the details of debiasing methods clearly, including the objective function. 
      • Presents the algorithmic representation of the proposed solution comprehensively. 
      • Show a schematic representation of your proposed approach. 
       
      Experiments/Evaluations: 
      • Provides a clear description of the experimental setup, including datasets, algorithm 
      evaluations, and metrics. 
      • Presents experimental results effectively, with appropriate figures. 
      • Conducts a thorough analysis and comparison of baseline and proposed method. 
      • Provides detailed insights on the results. 
       
      Conclusion: 
      • Effectively summarises the methods and results. 
      • Provides valuable insights or suggestions for future work. 
      • Provide strengths and weaknesses of your work; furthermore, provide future directions. 
       
      References: 
      • Lists all references, cited in the report. 
      • Formats all references consistently and correctly. 
       
      Appendix: 
      • Provide instructions on how to run your code. 
      • Provide additional/supporting figures or experimental evaluations. 
       
      Note: Please follow the LaTeX format for the report provided on Canvas. 
       
      5. Submission guidelines 
      1. Go to Canvas and upload the following files/folders compressed together as a zip file. 
      ● Report (a PDF file) 
      The report should include all members' details (student IDs and names). 
      ● Code (a folder): 
      ○ Algorithm (a sub-folder): your code (could be multiple files or a project). 
      ○ Input data (a sub-folder): empty. Please do NOT include the datasets in the zip file as they are large. Please provide detailed instructions on how the datasets are used and how to download them. We will copy the dataset to the input folder when we test the code. 
      2. A plagiarism checker will be used, both for code and report. 
      3. A penalty of MINUS 20 percent of the marks (−20%) applies per day after the due date. The maximum delay is 5 (five) days; after that, assignments will not be accepted. 
       
      Note: Only one student needs to submit the zip file, which must be named with the student ID numbers of all group members separated by underscores and which should contain all the relevant files and the report, e.g., "xxxxxxxx_xxxxxxxx_xxxxxxxx.zip". Please write the names and email addresses of each member in the report. 
       
       
      Example References: 
      1. Bias and Fairness in Large Language Models: A Survey. Isabel O. Gallegos, Ryan A. 
      Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, 
      Ruiyi Zhang, Nesreen K. Ahmed. https://arxiv.org/abs/2309.00770 
      2. A Survey on Fairness in Large Language Models. Yingji Li, Mengnan Du, Rui Song, Xin 
      Wang, Ying Wang. https://arxiv.org/abs/2308.10149 
      3. Fair Diffusion: Instructing Text-to-Image Generation Models on Fairness. Felix Friedrich, 
      Manuel Brack, Lukas Struppek, Dominik Hintersdorf, Patrick Schramowski, Sasha 
      Luccioni, Kristian Kersting. https://arxiv.org/abs/2302.10893 
      4. Stable Bias: Analyzing Societal Representations in Diffusion Models. Alexandra Sasha 
      Luccioni, Christopher Akiki, Margaret Mitchell, Yacine Jernite. 
      https://arxiv.org/abs/2303.11408 
       
       6. Marking Rubrics 
      Criterion | Marks | Comments 
       
      Coding (30 Marks): 
      • The code will be run to check whether it works properly and produces the figures and all the evaluations demonstrated in the report. 
       
      Abstract (5 Marks): 
      • Clearly introduces the topic scenario and its significance. (1 Mark) 
      • Provides a concise summary of the proposed evaluation method. (2 Marks) 
      • Provide the results from various evaluation metrics. (1 Mark) 
      • Conclude your contributions and discuss their applicability in the real-world scenario. (1 Mark) 
       
      Introduction (10 Marks): 
      • Clearly introduces the problem of bias in generative AI 
      and its importance. (3 Marks) 
      • Provides a clear and detailed overview of the proposed 
      methods. (3 Marks) 
      • Write the contributions in detail, e.g., pre-processing, experimental setup, mathematical model, proposed evaluation method and metrics, and the various steps taken to achieve and evaluate your results. (2 Marks) 
      • Provide discussion on the key results and show the 
      organisation of your report at the end of this section. (2 
      Marks) 
       
      Related Work (10 Marks): 
      • Provides a comprehensive review of related debiasing 
      and fairness methods. (3 Marks) 
      • Discusses the advantages and disadvantages of the 
      reviewed methods in the literature. (3 Marks) 
      • Demonstrates understanding of the existing literature. (2 
      Marks) 
      • Provide a summary table of the existing works showing their contributions, evaluation methods, strengths, and weaknesses. (2 Marks) 
       
       
       
        
      Proposed Method (20 Marks): 
      • Explains the theoretical foundations of the proposed 
      solution effectively. (7 Marks) 
      • Describes the details of debiasing methods clearly, 
      including the objective function. (4 Marks) 
      • Presents the algorithmic representation of the proposed 
      solution comprehensively. (7 Marks) 
      • Shows schematic representation of proposed approach. 
      (2 Marks) 
       
      Experiments/Evaluations (20 Marks): 
      • Provides a clear description of the experimental setup, 
      including datasets, algorithm evaluations, and metrics. 
      (7 Marks) 
      • Presents experimental results effectively, with 
      appropriate figures. (7 Marks) 
      • Conducts a thorough analysis and comparison of 
      baseline and proposed method. (4 Marks) 
      • Provides detailed insights on the results. (4 Marks) 
       
      Conclusion (5 Marks): 
      • Effectively summarises the methods and results. (1 Mark) 
      • Provides valuable insights or suggestions for future work. (2 Marks) 
      • Provide strengths and weaknesses of your work; furthermore, provide future directions. (2 Marks) 
       
      References: 
      • Lists all references, cited in the report. 
      • Formats all references consistently and correctly. 
       
      Overall Presentation (10 Marks): 
      • Maintains a clear and logical structure throughout the 
      report. (5 Marks) 
      • Demonstrates excellent writing quality, including clarity 
      and coherence. (3 Marks) 
      • Adheres to formatting and citation guidelines 
      consistently. (2 Marks) 
       
      Total: 100 Marks 

