Tag: Python

Build a RAG pipeline with langchain in 5 steps: an example HR policy Q&A bot

RAG (Retrieval-Augmented Generation) is a technique that helps an LLM (large language model) answer questions more accurately, without being limited by its knowledge cutoff or by what it learned during training.

RAG works in 2 steps:

Retrieve: fetch the documents relevant to the question
Generate: produce an answer from the retrieved documents

RAG has 3 advantages:

Answers are more accurate
Answers are more relevant to the question
The LLM's knowledge can be updated without retraining the model

In this article, we'll look at how to build a RAG pipeline with langchain, a framework for developing LLM applications.

If you're ready, let's get started.

🔆 High-Level View

We can build a RAG pipeline with langchain in 5 steps:

Load documents
Split text
Embed and store chunks
Create a retriever
Generate a response

Let's walk through building the pipeline with an example: a bot that answers questions about HR policies such as leave and benefits.

📑 Step 1. Load Documents

In the first step, we load the documents that will serve as the data source for the RAG pipeline.

langchain provides several document loaders, for example:

Function	Document
`TextLoader()`	Text file
`UnstructuredMarkdownLoader()`	Markdown file
`CSVLoader()`	CSV file
`JSONLoader()`	JSON file
`PyPDFLoader()`	PDF file
`DirectoryLoader()`	Files in a folder

In this example, we'll use DirectoryLoader() because our documents are stored in a folder named documents:

			
documents/
├── benefits_policy.txt
├── compensation_policy.txt
├── leave_policy.txt
└── remote_work_policy.txt

		

Example content of benefits_policy.txt:

			
DataWise Co. Benefits Policy
Full-time employees receive health insurance after completing probation.
The company provides annual health checkups once per year.
Employees can claim up to 2,000 THB per month for wellness activities such as fitness memberships, yoga classes, or mental health support.
Employees are also eligible for learning support. The company reimburses up to 10,000 THB per year for approved online courses, books, or professional certificates.

		

How to use DirectoryLoader():

			
# Import packages
from langchain_community.document_loaders import DirectoryLoader
from langchain_community.document_loaders import TextLoader
# Initialise loader
loader = DirectoryLoader(
    path="documents",
    glob="*.txt",
    loader_cls=TextLoader,
    loader_kwargs={"encoding": "utf-8"}
)
# Load documents
docs = loader.load()

		

DirectoryLoader() arguments:

path = the folder to load from
glob = the file name pattern to load (for example, "*.txt" matches every file whose name ends with .txt)
loader_cls = the loader class used for each file (here, TextLoader)
loader_kwargs = extra arguments passed to that loader
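
To see what a pattern such as "*.txt" matches, here is a minimal sketch that uses only Python's standard library (pathlib) rather than langchain. The helper name list_matching_files and the extra file notes.md are made up for illustration, and the files are created in a temporary folder.

```python
import tempfile
from pathlib import Path


def list_matching_files(folder: str, pattern: str) -> list[str]:
    """Return the sorted names of files in `folder` that match `pattern`."""
    return sorted(p.name for p in Path(folder).glob(pattern))


# Build a throwaway folder with three files (names are illustrative)
with tempfile.TemporaryDirectory() as tmp:
    for name in ["leave_policy.txt", "benefits_policy.txt", "notes.md"]:
        (Path(tmp) / name).write_text("example", encoding="utf-8")

    matched = list_matching_files(tmp, "*.txt")

print(matched)  # ['benefits_policy.txt', 'leave_policy.txt']
```

Only the two .txt files are matched; notes.md is skipped because its name does not end with .txt.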

We can preview the loaded documents like this:

			
# View loaded documents
for doc in docs:
    print("=" * 50)
    print(doc.metadata["source"])
    print("=" * 50)
    print(doc.page_content[:200])

		

Output:

			
==================================================
documents/remote_work_policy.txt
==================================================
DataWise Co. Remote Work Policy
Employees may work from home up to 2 days per week.
Remote work must be approved by the employee's direct manager.
Employees must be reachable on Slack during core w
==================================================
documents/benefits_policy.txt
==================================================
DataWise Co. Benefits Policy
Full-time employees receive health insurance after completing probation.
The company provides annual health checkups once per year.
Employees can claim up to 2,000 THB 
==================================================
documents/compensation_policy.txt
==================================================
DataWise Co. Compensation Policy
Salary is paid on the last working day of each month.
Performance bonuses are reviewed once per year in December.
Employees may receive an annual salary adjustment 
==================================================
documents/leave_policy.txt
==================================================
DataWise Co. Leave Policy
Full-time employees receive 10 days of annual leave per year after completing probation.
Employees receive 15 days of paid sick leave per year.
Sick leave of 3 consecutive

		

📚 Step 2. Split Text

In step 2, we split the text in each document into pieces, or chunks, because smaller pieces make it easier to retrieve just the relevant information.

langchain has 3 main functions for splitting text:

Function	Method
`CharacterTextSplitter()`	Splits by a specified number of characters
`TokenTextSplitter()`	Splits by a specified number of tokens
`RecursiveCharacterTextSplitter()`	Splits by paragraph, then line, then word, until each chunk fits the size limit

In this example, we'll use RecursiveCharacterTextSplitter() because it tries to keep related text together, so it tends to preserve meaning better than the other two.

How to use RecursiveCharacterTextSplitter():

			
# Import package
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Create splitter
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=100
)
# Split documents
chunks = text_splitter.split_documents(docs)

		

We can preview the chunks like this:

			
# View results
for i, chunk in enumerate(chunks[:5]):
    print(f"Chunk {i+1}")
    print("Source:", chunk.metadata["source"])
    print(chunk.page_content)
    print("-" * 50)

		

Output:

			
Chunk 1
Source: documents/remote_work_policy.txt
DataWise Co. Remote Work Policy
Employees may work from home up to 2 days per week.
Remote work must be approved by the employee's direct manager.
Employees must be reachable on Slack during core working hours from 10:00 AM to 4:00 PM.
Employees working remotely are responsible for maintaining a stable internet connection and a quiet work environment.
New employees may request remote work only after completing their first month.
--------------------------------------------------
Chunk 2
Source: documents/benefits_policy.txt
DataWise Co. Benefits Policy
Full-time employees receive health insurance after completing probation.
The company provides annual health checkups once per year.
Employees can claim up to 2,000 THB per month for wellness activities such as fitness memberships, yoga classes, or mental health support.
Employees are also eligible for learning support. The company reimburses up to 10,000 THB per year for approved online courses, books, or professional certificates.
--------------------------------------------------
Chunk 3
Source: documents/compensation_policy.txt
DataWise Co. Compensation Policy
Salary is paid on the last working day of each month.
Performance bonuses are reviewed once per year in December.
Employees may receive an annual salary adjustment based on company performance, individual performance, and market benchmarks.
Overtime pay is available only for non-managerial employees and must be approved by a manager before the overtime work begins.
--------------------------------------------------
Chunk 4
Source: documents/leave_policy.txt
DataWise Co. Leave Policy
Full-time employees receive 10 days of annual leave per year after completing probation.
Employees receive 15 days of paid sick leave per year.
Sick leave of 3 consecutive days or more requires a medical certificate.
Employees should submit annual leave requests at least 7 days in advance through the HR system.
Unused annual leave can be carried over for up to 5 days into the next calendar year.
--------------------------------------------------

		

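Notice that the text is split at natural boundaries rather than mid-sentence. Here each policy file fits within chunk_size, so every chunk is a complete policy that makes sense on its own.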
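
To make chunk_size and chunk_overlap more concrete, here is a simplified sketch in plain Python. It is not how RecursiveCharacterTextSplitter works internally: it just cuts fixed-size character windows, and the function name split_with_overlap is made up for illustration. It only shows how neighbouring chunks end up sharing some text.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut `text` into windows of `chunk_size` characters.

    Each window starts `chunk_size - chunk_overlap` characters after the
    previous one, so neighbouring chunks share `chunk_overlap` characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrst"  # 20 characters
chunks = split_with_overlap(sample, chunk_size=8, chunk_overlap=3)
print(chunks)  # ['abcdefgh', 'fghijklm', 'klmnopqr', 'pqrst']
```

Each chunk repeats the last 3 characters of the previous one. The overlap helps keep context that would otherwise be lost at a chunk boundary.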

💾 Step 3. Embed & Store Chunks

In step 3, we embed the chunks and store them in a vector database.

Embedding means converting a chunk into a vector: a list of numbers that represents the chunk's meaning.

Example chunk:

"Employees can work from home up to two days per week."

Example vector:

			
[
    0.021,
   -0.184,
    0.736,
    0.094,
   -0.511,
    0.302,
    0.087,
   -0.624
]

		

Vectors are what the system uses to find relevant documents. Chunks with similar meanings have vectors that are close to each other, so when we ask a question, the system returns the chunks whose vectors are closest to the vector of our question.
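
One common way to measure how close two vectors are is cosine similarity. The sketch below uses made-up 3-dimensional vectors (real embeddings are far longer), and a vector database may use a different measure, such as Euclidean distance.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for a question and two chunks
question_vec = [0.9, 0.1, 0.0]
leave_vec = [0.8, 0.2, 0.1]
salary_vec = [0.1, 0.2, 0.9]

leave_score = cosine_similarity(question_vec, leave_vec)
salary_score = cosine_similarity(question_vec, salary_vec)

# The chunk pointing in a similar direction to the question scores higher
print(leave_score > salary_score)  # True
```

A score near 1 means the two vectors point in almost the same direction. A score near 0 means they are unrelated.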

In langchain, we can choose which model to use for embedding. In this example, we'll use Gemini:

			
# Import packages
import os
from langchain_google_genai import GoogleGenerativeAIEmbeddings
# Get API key
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY")
# Create embedder
document_embedder = GoogleGenerativeAIEmbeddings(
    model="gemini-embedding-001",
    task_type="retrieval_document",
    google_api_key=GEMINI_API_KEY
)

		

Once we have the embedding model, we create a vector database to store the vectors. In this example, we'll use FAISS:

			
# Import package
from langchain_community.vectorstores import FAISS
# Build vector DB
vectorstore = FAISS.from_documents(
    documents=chunks,
    embedding=document_embedder
)

		

Notice that we pass document_embedder to the vector database so that it can convert each chunk into a vector before storing it.
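
Conceptually, a vector database keeps each chunk next to its vector and, at query time, returns the chunks whose vectors are nearest to the query vector. The toy class below sketches that idea in plain Python. TinyVectorStore and its vectors are made up for illustration and bear no relation to how FAISS is implemented.

```python
import math


class TinyVectorStore:
    """A toy in-memory store: keeps (text, vector) pairs and finds the nearest ones."""

    def __init__(self) -> None:
        self.items: list[tuple[str, list[float]]] = []

    def add(self, text: str, vector: list[float]) -> None:
        self.items.append((text, vector))

    def search(self, query_vector: list[float], k: int = 2) -> list[str]:
        """Return the texts of the k vectors closest to `query_vector`."""
        ranked = sorted(self.items, key=lambda item: math.dist(item[1], query_vector))
        return [text for text, _ in ranked[:k]]


store = TinyVectorStore()
store.add("leave policy", [0.8, 0.2, 0.1])
store.add("compensation policy", [0.1, 0.2, 0.9])
store.add("remote work policy", [0.5, 0.5, 0.2])

results = store.search([0.9, 0.1, 0.0], k=2)
print(results)  # ['leave policy', 'remote work policy']
```

The k argument controls how many of the nearest chunks are returned.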

🔎 Step 4. Create a Retriever

In step 4, we create a retriever, which searches the vector database for the chunks most similar to a question. We do this with .as_retriever(), where k is the number of chunks to return:

			
# Create retriever
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 2}
)

		

We can test the retriever to see which chunks it returns:

			
# Test retriever
question = "Do I need a medical certificate for sick leave?"
relevant_docs = retriever.invoke(question)
for i, doc in enumerate(relevant_docs, start=1):
    print(f"Retrieved chunk {i}")
    print("Source:", doc.metadata["source"])
    print(doc.page_content)
    print("-" * 60)

		

Output:

			
Retrieved chunk 1
Source: documents/leave_policy.txt
DataWise Co. Leave Policy
Full-time employees receive 10 days of annual leave per year after completing probation.
Employees receive 15 days of paid sick leave per year.
Sick leave of 3 consecutive days or more requires a medical certificate.
Employees should submit annual leave requests at least 7 days in advance through the HR system.
Unused annual leave can be carried over for up to 5 days into the next calendar year.
------------------------------------------------------------
Retrieved chunk 2
Source: documents/compensation_policy.txt
DataWise Co. Compensation Policy
Salary is paid on the last working day of each month.
Performance bonuses are reviewed once per year in December.
Employees may receive an annual salary adjustment based on company performance, individual performance, and market benchmarks.
Overtime pay is available only for non-managerial employees and must be approved by a manager before the overtime work begins.
------------------------------------------------------------

		

🤖 Step 5. Generate a Response

In the last step, we have an LLM generate an answer from the chunks retrieved from the vector database.

In this example, we'll use Gemini to write the answer.

We start by connecting to Gemini and creating a prompt:

			
# Import packages
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.prompts import ChatPromptTemplate
# Initialise Gemini
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0,
    google_api_key=GEMINI_API_KEY
)
# Create prompt template
prompt = ChatPromptTemplate.from_template("""
You are an HR policy assistant.
Answer the user's question using only the policy context below.
Rules:
- Do not use outside knowledge.
- If the answer is not in the context, say:
  "I could not find this information in the available company policies."
- Keep the answer concise.
- Mention the source policy file when possible.
Policy context:
{context}
User question:
{question}
""")

		

Next, we define the question and retrieve the relevant chunks from the vector database:

			
# Ask a question
question = "Do I need a medical certificate for sick leave?"
# Retrieve relevant document chunks
relevant_docs = retriever.invoke(question)
# Combine retrieved chunks into one context string
context = "\n\n".join(
    [
        f"Source: {doc.metadata['source']}\n"
        f"{doc.page_content}"
        for doc in relevant_docs
    ]
)
# Inspect retrieved context before sending it to Gemini
print("Retrieved context:")
print(context)

		

Output:

			
Retrieved context:
Source: documents/leave_policy.txt
DataWise Co. Leave Policy
Full-time employees receive 10 days of annual leave per year after completing probation.
Employees receive 15 days of paid sick leave per year.
Sick leave of 3 consecutive days or more requires a medical certificate.
Employees should submit annual leave requests at least 7 days in advance through the HR system.
Unused annual leave can be carried over for up to 5 days into the next calendar year.
Source: documents/compensation_policy.txt
DataWise Co. Compensation Policy
Salary is paid on the last working day of each month.
Performance bonuses are reviewed once per year in December.
Employees may receive an annual salary adjustment based on company performance, individual performance, and market benchmarks.
Overtime pay is available only for non-managerial employees and must be approved by a manager before the overtime work begins.

		

Then we send the question and the retrieved context to Gemini:

			
# Add context and question to prompt template
messages = prompt.invoke(
    {
        "context": context,
        "question": question
    }
)
# Send prompt to Gemini
response = llm.invoke(messages)
# Print Gemini's answer
print(response.content)

		

Output:

			
Yes, sick leave of 3 consecutive days or more requires a medical certificate. (Source: documents/leave_policy.txt)

To make the RAG pipeline easier to reuse, we can wrap this code in a function:

			
# Convert to function
def ask_policy_question(question: str) -> str:
    """
    Retrieve relevant policy chunks, send them to Gemini,
    and return Gemini's answer.
    """
    # Retrieve relevant chunks
    relevant_docs = retriever.invoke(question)
    # Combine retrieved chunks into context
    context = "\n\n".join(
        [
            f"Source: {doc.metadata['source']}\n"
            f"{doc.page_content}"
            for doc in relevant_docs
        ]
    )
    # Add context and question to prompt template
    messages = prompt.invoke(
        {
            "context": context,
            "question": question
        }
    )
    # Send completed prompt to Gemini
    response = llm.invoke(messages)
    # Return answer text
    return response.content

		

After that, asking a new question takes only a couple of lines:

			
# Test function
answer = ask_policy_question("Who is eligible for health insurance?")
print(answer)

Output:

			
Full-time employees receive health insurance after completing probation. (Source: documents/benefits_policy.txt)

💪 Summary

In this article, we learned how to build a RAG pipeline with langchain in 5 steps:

Load documents: load the documents for the RAG pipeline
Split text: split the text in the documents into chunks
Embed and store chunks: convert chunks into vectors and store them in a database
Create a retriever: create a component that searches the vector database
Generate a response: generate an answer from the retrieved documents

😺 GitHub

See all the example code and documents on GitHub.

2026-07-02

How to load Google Sheets data into Python on Google Colab in 3 steps: an example with a Harry Potter transaction dataset

Google Sheets is a free tool for storing data that anyone can access, and it often holds both personal data (such as income and expenses) and business data (such as sales and customer records).

Although Google Sheets can analyse data on its own, the analysis becomes more efficient when we bring in a programming language like Python.

Besides processing quickly and handling large amounts of data, Python can also automate the analysis once we have written the code.

In this article, we'll look at how to load data from Google Sheets into Python on Google Colab.

The article has 3 parts:

Load spreadsheet
Load worksheet
Load data

If you'd like to follow along, the sample file is available at this link:

If you're ready, let's get started.

1️⃣ Step 1. Load Spreadsheet

First, we load the spreadsheet we want in 2 steps:

Authorise: give Colab permission to access Google Drive
Open spreadsheet: connect to the Google Sheets file we want

✅ 1.1 Authorise

We give Colab access to Google Drive like this:

			
# Grant Colab access to Google services
# Import package
from google.colab import auth
# Enable access to Google services
auth.authenticate_user()

		

When we click “Run”, Google takes us to the Sign In page, where we click “Continue”:

Next, tick the checkboxes to grant Colab permission, then click “Continue”.

After granting access, we create a client for working with our Google Sheets files:

			
# Connect to Google Drive
# Import packages
import gspread
from google.auth import default
# Get credentials and Google Cloud project ID
creds, _ = default()
# Create Google Sheet client
gc = gspread.authorize(creds)

		

Note:

default() returns 2 values: the credentials and the Google Cloud project ID
We only use the credentials
We discard the Google Cloud project ID by assigning it to _

📖 1.2 Open Spreadsheet

After creating the client, we connect to the spreadsheet. To do that, we need the spreadsheet ID, which is the part of the URL between /d/ and /edit, as shown in the image:

Copy the ID and use it like this:

			
# Load spreadsheet
# Define spreadsheet ID
spreadsheet_id = "12MglU8pFc_7XAylqANyqm8aLQNwvL98fXObjHEjrRvQ"
# Open spreadsheet
spreadsheet = gc.open_by_key(spreadsheet_id)
# Print spreadsheet title
print(spreadsheet.title)

		

Output:

Diagon Alley Artefacts

We have now loaded the spreadsheet.
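
If we would rather not copy the ID by hand, it can be pulled out of the URL with a regular expression. This is a small sketch: the helper name extract_spreadsheet_id is made up, and the URL is an illustrative one built from the ID used above.

```python
import re


def extract_spreadsheet_id(url: str) -> str:
    """Return the ID that follows '/spreadsheets/d/' in a Google Sheets URL."""
    match = re.search(r"/spreadsheets/d/([a-zA-Z0-9_-]+)", url)
    if match is None:
        raise ValueError("No spreadsheet ID found in URL")
    return match.group(1)


# Illustrative URL built from the ID used in this article
url = "https://docs.google.com/spreadsheets/d/12MglU8pFc_7XAylqANyqm8aLQNwvL98fXObjHEjrRvQ/edit#gid=0"
sheet_id = extract_spreadsheet_id(url)
print(sheet_id)  # 12MglU8pFc_7XAylqANyqm8aLQNwvL98fXObjHEjrRvQ
```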

Example spreadsheet:

2️⃣ Step 2. Load Worksheet

After loading the spreadsheet, we connect to the worksheet we want in 2 steps:

List: view the names of all worksheets in the spreadsheet
Select: choose a worksheet

📋 2.1 List

We can list all worksheet names like this:

			
# List worksheets
# Get all worksheet names
worksheets = spreadsheet.worksheets()
# Print them
for ws in worksheets:
    print(ws.title)

		

Output:

			
transactions
Sheet2
Sheet3

🫳 2.2 Select

Next, we load the worksheet we want (for example, transactions):

			
# Select worksheet
worksheet = spreadsheet.worksheet("transactions")

We are now connected to the worksheet.

3️⃣ Step 3. Load Data

Finally, we load the data from the worksheet in 3 steps:

Read data: load the data from the worksheet
Convert to DataFrame: turn the data into a DataFrame
Analyse: analyse the data as needed

👓 3.1 Read Data

We load the data from the worksheet like this:

			
# Get all data from worksheet
data = worksheet.get_all_values()
# Print result
data

The result is a list of lists (1 list = 1 row):
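
As a rough picture of that shape, the sketch below uses hypothetical rows (the item column and all values are made up; only quantity and unit_price appear in this article). The first inner list is the header, and every value is a string.

```python
# Hypothetical rows shaped like the output of get_all_values():
# the first inner list is the header, the rest are data rows (all strings)
data = [
    ["item", "quantity", "unit_price"],
    ["Wand", "2", "350"],
    ["Cauldron", "1", "500"],
]

header, rows = data[0], data[1:]

# Pair each row with the header to get one dict per row
records = [dict(zip(header, row)) for row in rows]

print(records[0])  # {'item': 'Wand', 'quantity': '2', 'unit_price': '350'}
```

Splitting the header from the rows is the same idea as passing data[0] as column names and data[1:] as the data when building a DataFrame.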

🐼 3.2 Convert to DataFrame

To make the data easier to analyse, we convert it into a DataFrame with pandas:

			
# Convert data to df
# Import package
import pandas as pd
# Convert
df = pd.DataFrame(
    data=data[1:],
    columns=data[0]
)
# Print result
df

		

Output:

📈 3.3 Analyse

From here, we can analyse the data as needed, for example by calculating total sales:

			
# Find total sales
# Convert column types to numeric
df["quantity"] = pd.to_numeric(df["quantity"], errors="coerce")
df["unit_price"] = pd.to_numeric(df["unit_price"], errors="coerce")
# Calculate total sales per row
df["revenue"] = df["quantity"] * df["unit_price"]
# Calculate sum sales
total_sales = df["revenue"].sum()
# Print result
print(total_sales)

		

Output:

11600.0

Note:

The loaded values are all strings, so we need to convert the number columns to a numeric type (such as float) before analysing them
See how to analyse data with pandas
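
To show roughly what errors="coerce" means, here is a plain-Python sketch of the same idea: values that cannot be parsed become NaN instead of raising an error. The helper to_number and the sample values are made up for illustration. This is not the pandas implementation.

```python
import math


def to_number(value: str) -> float:
    """Convert a string to float; return NaN when it is not a valid number."""
    try:
        return float(value)
    except ValueError:
        return math.nan


values = ["2", "3.5", "n/a"]
numbers = [to_number(v) for v in values]
print(numbers)  # [2.0, 3.5, nan]
```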

💪 Summary

In this article, we looked at how to load Google Sheets data into Python on Google Colab in 3 steps:

Step 1. Load spreadsheet:

Code	For
`auth.authenticate_user()`	Grant access to Google Drive
`default()`	Get credentials
`gspread.authorize(creds)`	Create a client
`gc.open_by_key(spreadsheet_id)`	Connect to the spreadsheet

Step 2. Load worksheet:

Code	For
`spreadsheet.worksheets()`	List all worksheets
`spreadsheet.worksheet("worksheet_name")`	Select a worksheet

Step 3. Load data:

Code	For
`worksheet.get_all_values()`	Load the data in the worksheet
`pd.DataFrame(data=data[1:], columns=data[0])`	Convert the data to a DataFrame

2026-06-18

5 keywords for handling exceptions in Python: try, except, else, finally, raise, with an online payment example
An exception is an error that occurs while running code whose syntax is valid.

For example, dividing a number by 0:
```
print(5 / 0)
```
Output:
```
ZeroDivisionError
```
An exception can stop the code from running or cause it to behave incorrectly.

So when we write code, we should decide how exceptions will be handled to keep the program from failing.

Python has 5 keywords for handling exceptions:
1. try
2. except
3. else
4. finally
5. raise
Let's look at how to use all 5 keywords through an online payment example.
🔨 try, except

try and except are used together. In try, we put the code that we think may raise an exception.

In except, we put what should happen when an exception occurs.

For example, suppose we write code to check that payment is not negative, but the payment passed in may not be a number, which makes the code stop:
```
# Without try, except

# Set payment
payment = "one thousand"

# Validate payment
if float(payment) < 0:
    print("Payment cannot be negative.")
```
Output:
```
ValueError
```
We can use try and except to keep the code running and tell us what went wrong:
```
# Set payment
payment = "one thousand"

# Code that may raise exception
try:
    if float(payment) < 0:
        print("Payment cannot be negative.")

# Print when exception occurs
except ValueError:
    print("Payment must be a number.")
```
Output:
```
Payment must be a number.
```
🤔 else

else is the counterpart of except: instead of running when an exception occurs, else runs only when no exception is raised in try.

For example, use else to show a processing message when payment is a number:
```
# Set payment
payment = 500

# Code that may raise exception
try:
    if float(payment) < 0:
        print("Payment cannot be negative.")

# Print when exception occurs
except ValueError as e:
    print(f"Error: {e}")

# Print when exception does not occur
else:
    print("Processing payment ...")
```
Output:
```
Processing payment ...
```
☝️ finally

finally runs whether or not an exception occurs.

For example, use finally to thank the customer whether or not the payment goes through:
```
# Set payment
payment = 500

# Code that may raise exception
try:
    if float(payment) < 0:
        print("Payment cannot be negative.")

# Print when exception occurs
except ValueError as e:
    print(f"Error: {e}")

# Print when exception does not occur
else:
    print("Processing payment ...")

# Print no matter what
finally:
    print("Thank you for your payment.")
```
Output:
```
Processing payment ...
Thank you for your payment.
```
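
finally also runs when try or except exits early with return. The sketch below illustrates this with a made-up helper, charge, that records each check in a list.

```python
log = []  # collects a message every time the finally block runs


def charge(payment):
    """Return the payment as a float, or None if it is not a number."""
    try:
        return float(payment)
    except ValueError:
        return None
    finally:
        # Runs after the return value is computed but before the function exits
        log.append(f"checked {payment!r}")


print(charge("500"))           # 500.0
print(charge("one thousand"))  # None
print(log)                     # ["checked '500'", "checked 'one thousand'"]
```

This makes finally a good place for cleanup that must always happen, such as closing a file or a connection.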
👋 raise

Finally, we can use raise to raise an exception ourselves.

For example, use raise to signal an error when payment is negative:
```
# Set payment
payment = -50

# Code that may raise exception
try:
    if not isinstance(payment, (int, float)):
        raise TypeError("Payment must be a number.")
    if payment < 0:
        raise ValueError("Payment cannot be negative.")

# Print when exception occurs
except (TypeError, ValueError) as e:
    print(f"Error: {e}")

# Print when exception does not occur
else:
    print("Processing payment ...")

# Print no matter what
finally:
    print("Thank you for your payment.")
```
Output:
```
Error: Payment cannot be negative.
Thank you for your payment.
```
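
raise also works with exception classes we define ourselves, which lets callers catch one specific problem. In this sketch, NegativePaymentError and validate_payment are hypothetical names that do not appear in the original code.

```python
class NegativePaymentError(ValueError):
    """Hypothetical custom exception for payments below zero."""


def validate_payment(payment):
    """Return the payment unchanged, or raise if it is invalid."""
    if not isinstance(payment, (int, float)):
        raise TypeError("Payment must be a number.")
    if payment < 0:
        raise NegativePaymentError(f"Payment cannot be negative: {payment}")
    return payment


try:
    validate_payment(-50)
except NegativePaymentError as e:
    message = f"Error: {e}"

print(message)  # Error: Payment cannot be negative: -50
```

Because NegativePaymentError inherits from ValueError, existing code that catches ValueError still handles it.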
💪 Summary: 5 Keywords

In this article, we learned how to use 5 keywords to handle exceptions in Python:
1. try: runs code that may raise an exception
2. except: code that runs when try raises an exception
3. else: code that runs when try raises no exception
4. finally: code that runs whether or not try raises an exception
5. raise: raises an exception that we specify
Example code:
```
# Set payments
payments = {
    "Alex": "one thousand",
    "Barbara": -50,
    "Carter": 500
}

# Loop through payments
for name, payment in payments.items():
    
    # Print name and payment
    print(f"{name} paying {payment}.")
    
    # Code that may raise exception
    try:
        if not isinstance(payment, (int, float)):
            raise TypeError("Payment must be a number.")
        if payment < 0:
            raise ValueError("Payment cannot be negative.")

    # Print when exception occurs
    except (TypeError, ValueError) as e:
        print(f"Error: {e}")

    # Print when exception does not occur
    else:
        print("Processing payment ...")

    # Print no matter what
    finally:
        print("Thank you for your payment.")
        
    # Print divider
    print("\n -------------------------------------------------- \n")
```
Output:
```
Alex paying one thousand.
Error: Payment must be a number.
Thank you for your payment.

 -------------------------------------------------- 

Barbara paying -50.
Error: Payment cannot be negative.
Thank you for your payment.

 -------------------------------------------------- 

Carter paying 500.
Processing payment ...
Thank you for your payment.

 -------------------------------------------------- 
```
📚 Further Reading: Python Exceptions

Learn about the types of exceptions in Python at: Python Built-in Exceptions

😺 GitHub

See all the code from this article on GitHub.

2026-04-09
How to use polars, a powerful package for working with tabular data in Python: an example with the IKEA Products dataset
polars is a Python package for working with tabular data. It is built with Rust and Apache Arrow, which makes it fast and efficient.

polars is an alternative for people frustrated by the limitations of pandas, the popular package for tabular data. polars has 3 advantages over pandas:
1. Fast: it processes data faster
2. Intuitive: its syntax is easier to use
3. Lazy: it supports lazy evaluation (covered in Section 8), which makes processing more efficient
Note: see how to use pandas in this article

Source: https://pola.rs/

In this article, we'll look at how to use polars by working with the IKEA Products dataset, which contains data on IKEA furniture.

The article has 9 sections:
1. Import package and dataset: load the package and the dataset
2. Explore: explore the dataset before working with it
3. Select: select data
4. Filter: filter data
5. Sort: sort data
6. Aggregate: compute summary statistics
7. Mutate: add, remove, and modify columns
8. Lazy: work lazily
9. Chaining: chain functions together
If you're ready, let's get started.
📦 Section 1. Import Package & Dataset

First, we load the package and the dataset we'll use.

We load the package with import:
```
import polars as pl
```
Note: the package must be installed first, which we can do with pip install polars

We load the dataset with read_csv() because the data is a CSV file:
```
df = pl.read_csv("ikea_products.csv")
```
Now the data is ready to work with.

🧭 Section 2. Explore

In the second step, we explore the data we just loaded. There are 5 ways to do this:
1. shape
2. schema
3. head()
4. glimpse()
5. describe()

🔷 2.1 shape

shape is an attribute that gives the number of rows and columns in the dataset:
```
df.shape
```
Output:

The output shows that the dataset has 3,694 rows and 14 columns.

🗺️ 2.2 schema

schema is an attribute that shows the name and data type of each column:
```
df.schema
```
Output:

🐵 2.3 head()

head() is a method for viewing the first n rows of the data, for example the first 10 rows:
```
df.head(10)
```
Example output:

🔎 2.4 glimpse()

glimpse() is a method for viewing the structure of the data, which includes:
1. The number of rows and columns
2. Column names
3. Data types
4. Sample values
```
df.glimpse()
```
Example output:

📝 2.5 describe()

describe() is a method that shows summary statistics for the columns:
1. count: number of values
2. null_count: number of missing values
3. mean: the average
4. std: the standard deviation
5. min: the minimum
6. 25%, 50%, 75%: the values at the 1st, 2nd, and 3rd quartiles
7. max: the maximum
```
df.describe()
```
Example output:

🫳 Section 3. Select

There are 2 ways to select rows and columns from the data:
1. Use []
2. Use slice() and select()

🔲 3.1 Using []

We use [] by specifying the rows and columns we want:
```
df[rows, cols]
```
If we want all rows or all columns, we leave that part out. For example, select the first 10 rows and all columns:
```
df[:10]
```
Example output:

Or select only the name, category, and price columns, with all rows:
```
df[["name", "category", "price"]]
```
Output:

To select both rows and columns, we specify both. For example, the first 10 rows with only the name, category, and price columns:
```
df[0:10, ["name", "category", "price"]]
```
Output:

🔪 3.2 Using slice() & select()

We can use slice() and select() instead of []:
1. Use slice() to select rows
2. Use select() to select columns
For example, select the first 10 rows:
```
df.slice(0, 10)
```
Example output:

Select the name, category, and price columns:
```
df.select(["name", "category", "price"])
```
Output:

Finally, we can combine slice() and select() to select both rows and columns:
```
df.slice(0, 10).select(["name", "category", "price"])
```
Output:

👀 Section 4. Filter

We filter data with filter(), which supports filtering on one condition or on several.

☝️ 4.1 One Condition

An example with 1 condition: keep only the rows for outdoor furniture:
```
df.filter(pl.col("category") == "Outdoor furniture")
```
Note: we use col() to refer to the column we want

Example output:

🖐️ 4.2 Multiple Conditions

To filter on multiple conditions, we combine them with logical operators:

Operator	Meaning
&	And
|	Or
~	Not

For example, select outdoor furniture priced above 1,000:
```
df.filter(
    (pl.col("category") == "Outdoor furniture") &
    (pl.col("price") > 1000)
)
```
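
Each condition in the filter above is wrapped in parentheses. That matters because in Python, & binds more tightly than comparison operators such as == and >. The sketch below shows the effect with plain integers (made-up stand-in values, not polars expressions).

```python
category_code = 1   # stand-in values, not taken from the dataset
price = 2

# With parentheses: each comparison is evaluated first, then combined
with_parens = (category_code == 1) & (price > 0)

# Without parentheses: Python reads this as category_code == (1 & price) > 0
without_parens = category_code == 1 & price > 0

print(with_parens)     # True
print(without_parens)  # False
```

The same precedence rule applies inside filter(), so it is safest to always parenthesise each condition.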
Example output:

↕️ Section 5. Sort

To sort data, we use sort(), which covers 3 cases:
1. Ascending: smallest to largest (A–Z)
2. Descending: largest to smallest (Z–A)
3. Multiple columns: sort by several columns at once

⬆️ 5.1 Ascending

By default, sort() orders from smallest to largest. For example, sort the data by price:
```
df.sort("price")
```
Example output:

⬇️ 5.2 Descending

To sort from largest to smallest, we set the argument descending=True:
```
df.sort("price", descending=True)
```
Example output:

🖐️ 5.3 Multiple Columns

To sort by several columns at once, we list the columns and the direction for each (ascending vs descending). For example, sort by furniture category (A–Z) and then by price (highest to lowest):
```
df.sort(
    ["category", "price"],
    descending=[False, True]
)
```
Example output:

🧮 Section 6. Aggregate

Aggregating means summarising data, such as computing an average. There are 2 ways to do it:
1. Without grouping, using select()
2. With grouping, using group_by() and agg()

🏠 6.1 Basic

An example without grouping: find the mean, minimum, and maximum furniture price:
```
df.select(
    pl.col("price").mean().alias("Mean"),
    pl.col("price").min().alias("Min"),
    pl.col("price").max().alias("Max")
)
```
Note: alias() sets the name of the output column

Output:

🏘️ 6.2 Group By

An example with grouping: find the mean, minimum, and maximum furniture price for each furniture category:
```
df.group_by("category").agg(
    pl.col("price").mean().alias("Mean"),
    pl.col("price").min().alias("Min"),
    pl.col("price").max().alias("Max")
)
```
Example output:

💪 Section 7. Mutate

Mutating means changing the columns of the data, such as adding or removing columns.

➕ 7.1 Add Columns

An example of adding columns:
1. Add a discount column (discount): items priced above 1,000 get 15% off, and all other items get 10% off
2. Add a column with the price after the discount (price_discounted)
We can write the code like this:
```
df.with_columns(
    discount = pl.when(pl.col("price") > 1000)
    .then(0.15)
    .otherwise(0.10),
).with_columns(
    price_discounted = pl.col("price") * (1 - pl.col("discount"))
)
```
Note: we use when(), then(), and otherwise() to express the condition
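
For a single value, when/then/otherwise behaves like an if/else expression. The plain-Python sketch below mirrors the discount rule for one price at a time. The function names are made up for illustration, and polars applies the rule to the whole column at once rather than row by row.

```python
def discount_rate(price: float) -> float:
    """15% for prices above 1,000; 10% for everything else."""
    return 0.15 if price > 1000 else 0.10


def price_after_discount(price: float) -> float:
    """Apply the discount rate to the price."""
    return price * (1 - discount_rate(price))


# Try a price above, at, and below the 1,000 threshold
for price in [2000, 1000, 500]:
    print(price, discount_rate(price), round(price_after_discount(price), 2))
```

A price of exactly 1,000 falls into the otherwise branch, so it gets the 10% discount.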

Example output:

Notice that the new columns are appended at the end.

🗑️ 7.2 Remove Columns

We remove columns with drop(). For example, remove the old price (old_price) and online availability (sellable_online) columns:
```
df.drop(["old_price", "sellable_online"])
```
ตัวอย่างผลลัพธ์:

🥱 Section 8. Lazy

Lazy evaluation defers processing until we explicitly ask for the result. This makes the work more efficient, because nothing is computed until it is needed and polars can optimise the whole query before running it.

Note: Processing each step immediately, without waiting for such a request, is called eager evaluation.

Lazy evaluation works in 3 steps:

Step 1. Create a LazyFrame, the data structure used for lazy evaluation, with lazy():
```
df_lz = df.lazy()
```
Step 2. Write the operations we want, such as selecting columns:
```
execution = df_lz.select(["name", "category", "price"])
```
Step 3. Run the query with collect():
```
execution.collect()
```
Output:

🔗 Section 9. Chaining

Chaining connects functions so that the result of one function is passed to the next:
df.function1().function2().function3()...
Chaining lets us answer more complex questions, such as:

For furniture designed by Francis Cayouette, which categories count as “Premium” (average price above 1,000) and which as “Affordable” (average price of 1,000 or less)?

We can answer this with polars like this:
```
df_lz.filter(
    pl.col("designer") == "Francis Cayouette"
).group_by(
    "category"
).agg(
    pl.col("price").mean().round().alias("avg_price")
).with_columns(
    pl.when(pl.col("avg_price") > 1000)
    .then(pl.lit("Premium"))
    .otherwise(pl.lit("Affordable"))
    .alias("price_label")
).sort(
    "avg_price",
    descending=True
).select(
    [
        "category",
        "price_label",
        "avg_price"
    ]
).collect()
```
Output:

⭐️ Summary

In this article, we saw how to use polars to work with tabular data. The code falls into 9 groups:

Section 1. Import package & dataset:
- import polars as pl
- pl.read_csv()
Section 2. Explore:
- df.shape
- df.schema
- df.head()
- df.glimpse()
- df.describe()
Section 3. Select:
- df[rows, cols]
- df.slice()
- df.select()
Section 4. Filter:
- df.filter()
- pl.col()
- &, |, ~
Section 5. Sort:
- df.sort()
Section 6. Aggregate:
- df.select()
- df.group_by().agg()
- alias()
Section 7. Mutate:
- df.with_columns()
- pl.when().then().otherwise()
- df.drop()
Section 8. Lazy:
- df.lazy()
- collect()
Section 9. Chaining:
- df.function1().function2().function3()...
⏭️ Next Step: DIY

If you want to practise using polars, the example code and dataset are available on GitHub.

🔔 If you enjoyed this article, please subscribe and follow along at:
- Website: shinoshigoto.com
- Facebook: Svaron Solution
- Instagram: @svaronsolution
- Thread: @svaronsolution
2026-03-05
Load data from a database in 4 steps with sqlalchemy and pandas in Python: an example with the Chinook database
In this article, we walk through 4 steps for loading data from a database with the sqlalchemy and pandas libraries in Python, using the Chinook database as an example:
1. Import libraries
2. Connect to the database
3. List the tables
4. Get the table
If you're ready, let's get started.
⬇️ 1. Import Libraries

First, we import sqlalchemy and pandas:
```
# Import packages
from sqlalchemy import create_engine, inspect
import pandas as pd
```
Note: If the libraries are not installed yet, run !pip install before importing them.

🛜 2. Connect to the Database

In step 2, we connect to the database.

In this example, we connect to a local SQLite database, which we can do with create_engine() like this:
```
# Connect to the database
engine = create_engine("sqlite:///chinook.sqlite")
```
Note: chinook.sqlite can be downloaded from GitHub.

📋 3. List the Tables

In step 3, we list the tables in the database so we can choose the ones we need.

We use 2 commands:
- inspect(): a function that creates an object holding the database's metadata
- .get_table_names(): a method that returns the names of the tables in the database
```
# Get the inspector
inspector = inspect(engine)

# List the table names
tables = inspector.get_table_names()

# Print the table names
print(tables)
```
Output:
```
['Album', 'Artist', 'Customer', 'Employee', 'Genre', 'Invoice', 'InvoiceLine', 'MediaType', 'Playlist', 'PlaylistTrack', 'Track']
```
🪑 4. Get the Table

In the final step, we load data from the table we want with pd.read_sql():
```
# Set the query
brazil_customers_query = """
SELECT FirstName, LastName, Phone, Email
FROM Customer
WHERE Country = 'Brazil';
"""

# Query the database
df = pd.read_sql(brazil_customers_query, engine)

# Display the df
print(df)
```
Output:
```
   FirstName   LastName               Phone                          Email
0       Luís  Gonçalves  +55 (12) 3923-5555           luisg@embraer.com.br
1    Eduardo    Martins  +55 (11) 3033-5446       eduardo@woodstock.com.br
2  Alexandre      Rocha  +55 (11) 3055-3278               alero@uol.com.br
3    Roberto    Almeida  +55 (21) 2271-7000  roberto.almeida@riotur.gov.br
4   Fernanda      Ramos  +55 (61) 3363-5547       fernadaramos4@uol.com.br
```
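The query above hard-codes 'Brazil'. When the value comes from a variable, a placeholder is usually safer than building the SQL string by hand. The sketch below shows the idea with the standard-library sqlite3 module and a made-up in-memory Customer table, so it runs without chinook.sqlite; it is a swapped-in illustration, not the sqlalchemy setup used in this article. pd.read_sql() also accepts a params argument for the same purpose, though the placeholder syntax depends on the driver.

```python
import sqlite3

# In-memory stand-in for chinook.sqlite with a tiny, made-up Customer table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (FirstName TEXT, LastName TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [("Ana", "Silva", "Brazil"), ("John", "Doe", "Canada"), ("Rui", "Costa", "Brazil")],
)


def customers_from(country):
    """Return (FirstName, LastName) rows for one country."""
    # The ? placeholder lets the driver handle quoting of the value.
    query = (
        "SELECT FirstName, LastName FROM Customer "
        "WHERE Country = ? ORDER BY FirstName"
    )
    return conn.execute(query, (country,)).fetchall()


print(customers_from("Brazil"))
```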
😺 GitHub

The full example code is available on GitHub.

2026-02-19
Build a personal chatbot in 5 steps with the OpenAI library in Python: an example Gemini chatbot
In this article, we look at how to build a personal chatbot with the openai library in Python in 5 steps:
1. Import libraries
2. Create a client
3. Create a chat history
4. Create a chat function
5. Chat
Note: We run the example code on Google Colab; the notebook is available at Gemini Chatbot in Google Colab.

If you're ready, let's get started.
🏁 Step 1. Import Libraries

First, we import the 2 libraries we need:
1. openai: for calling the AI service's API *
2. display and Markdown (from IPython.display): for rendering markdown text, such as the replies from the AI, so that it is easy to read
```
# Import libraries

# For Gemini
from openai import OpenAI

# For text rendering
from IPython.display import display, Markdown
```
Note: * The openai library was designed for the OpenAI API, but it also works with other AI services that offer an OpenAI-compatible endpoint, such as:
- Gemini
- Claude
- Typhoon
💁‍♂️ Step 2. Create a Client

In step 2, we create a client that connects to the AI acting as the chatbot's “brain”, using OpenAI(). We pass 2 arguments:
1. api_key: our API key
2. base_url: the URL used to call the API
In this example, we call Gemini, so we set the arguments like this:
```
# Create client
client = OpenAI(
    api_key="YOUR_API_KEY_HERE",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)
```
Note:
- Put your API key in place of "YOUR_API_KEY_HERE".
- See Using Gemini API keys for how to create a free API key.
- If you call the OpenAI API (ChatGPT) instead of Gemini, you can omit base_url.
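One way to avoid pasting the key into the notebook is to read it from an environment variable. This is a minimal sketch under assumptions: the helper `client_settings()` and the variable name `GEMINI_API_KEY` are hypothetical choices for illustration, not something the openai library requires.

```python
import os


def client_settings(env_var="GEMINI_API_KEY"):
    """Build the keyword arguments for OpenAI() from an environment variable."""
    # env_var is a hypothetical name; use whatever name you stored the key under.
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {
        "api_key": api_key,
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
    }


# Usage (assumes the openai package is installed and the variable is set):
# client = OpenAI(**client_settings())
```

Failing early with a clear message is a design choice here: a missing key is easier to diagnose at this point than as an authentication error on the first request.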
🙊 Step 3. Create a Chat History

In step 3, we create a chat history that stores:
1. The system prompt, which sets the chatbot's behaviour (in this example, a cheerful, eager assistant)
2. The conversation between us and the chatbot, which lets the chatbot remember what has been said
```
# Set system prompt
system_prompt = """
You are a helpful, cheerful, and optimistic assistant.

Be concise, validate answers, and admit when you don’t know.

Make responses clear, easy to read, and sprinkle in playful emoji.
"""

# Instantiate chat history
chat_history = [
    {
        "role": "system",
        "content": system_prompt
    }
]
```
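To see why the history lets the chatbot “remember”, here is a small sketch that replaces the API call with a stand-in. `fake_reply()` and `take_turn()` are hypothetical helpers for illustration; the point is only that the whole list, including earlier turns, is passed along on every call.

```python
def fake_reply(messages):
    """Stand-in for the API call: reports how many messages it received."""
    return f"I can see {len(messages)} messages."


# Start the history with a system prompt, as in the step above.
history = [{"role": "system", "content": "You are a helpful assistant."}]


def take_turn(history, user_prompt, reply_fn=fake_reply):
    """Append the user message, get a reply, and append the reply too."""
    history.append({"role": "user", "content": user_prompt})
    answer = reply_fn(history)
    history.append({"role": "assistant", "content": answer})
    return answer


first = take_turn(history, "Hello")
second = take_turn(history, "What did I just say?")

print(first)
print(second)
print([message["role"] for message in history])
```

The second call receives the first exchange as well, which is what gives a real model the context to answer follow-up questions.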
📨 Step 4. Create a Chat Function

In step 4, we create a function that lets us talk with the chatbot in real time:
```
# Create a function for chatbot
def chatbot(model="gemini-2.5-flash"):

    # Set chat history as global variable
    global chat_history

    # Print chat header
    display(Markdown("# 🟢 --- Chat Begins ---"))

    # Print chat instruction
    print("ℹ️ Type \"end chat\" to exit.")

    # Loop through conversation
    while True:

        # Render user prompt display
        display(Markdown("## 🧑‍💻 You:"))

        # Get user input
        user_prompt = input("")

        # Check if user wants to exit chat
        if user_prompt.lower() == "end chat":

            # Print goodbye message
            display(Markdown("## ✨ Assistant:\n" + "👋 See you later!"))

            # End chat
            break

        # Append user input to chat history
        chat_history.append(
            {
                "role": "user",
                "content": user_prompt
            }
        )

        # Get response
        response = client.chat.completions.create(

            # Set prompt
            messages=chat_history,

            # Set model
            model=model
        )

        # Append response to history
        chat_history.append(
            {
                "role": "assistant",
                "content": response.choices[0].message.content
            }
        )

        # Render response
        display(Markdown("## ✨ Assistant:\n" + response.choices[0].message.content + "\n"))
```
💬 Step 5. Chat

In the final step, we call chatbot() to start talking to the AI:
```
# Start chatting
chatbot()
```
Output:

👍 Google Colab

The full example code is available on Google Colab.

2026-02-12
Python for AI: a collection of 8 articles on working with AI in Python
Recently, as more of my work has involved AI, I have had the chance to share how to use Python for AI work.

To make this easier to share, I have organised the material into 8 articles (5 groups), which you can follow in this order:

🐍 Session #1. Intro to Python:
- Intro to Python: an introduction to using Python and its data types
🔁 Session #2. Control flow:
- Control flow: how to use statements such as if, for, and while to control how Python runs
💻 Session #3. Functions:
- Functions: how to create functions in Python
📦 Session #4. Packages and files:
- open(): working with files in base Python
- json package: working with JSON using the json package
- pd.read_csv(): working with CSV files using the pandas package
🤖 Session #5. AI packages:
- openai package: working with AI APIs through the openai package
- google-genai package: using google-genai to work with the Gemini API
2025-12-23
Analyse resumes in 3 steps with Gemini through the OpenAI library in Python: an example in Google Colab
This article is for companies and HR teams that want to use AI to cut the time spent screening applicants. We look at how to analyse resumes with Gemini through the OpenAI library in Python.

The article has 3 parts, following the steps of the analysis:
1. Install and load libraries
2. Set input
3. Analyse resumes
We work through the example in Google Colab (the full code is available here).

If you're ready, let's get started.
⬇️ 1. Install & Load Libraries

First, we install and load the libraries we need:
- openai: for calling the AI through its API
- drive from google.colab: for connecting to files in Google Drive
- PyPDF2: for extracting text from PDF files
- textwrap: for removing indentation from strings
- Console from rich.console and Markdown from rich.markdown: for rendering strings so that they are easier to read
Install:
```
# Install libraries
!pip install PyPDF2
```
Note: Google Colab already includes the other libraries, so we only need to install PyPDF2.

Load:
```
# Load libraries

# Connect to Gemini
from openai import OpenAI

# Connect to Google Drive
from google.colab import drive

# Extract text from PDF
import PyPDF2

# Dedent text
import textwrap

# Render markdown text
from rich.console import Console
from rich.markdown import Markdown
```
🔧 2. Set the Input

To analyse resumes, we need 3 inputs:
1. Client: for calling the Gemini API
2. Job description (JD): details of the position being filled
3. Resumes: the resume data we want to analyse
Let's look at how to set each input.

.

🧑‍💻 (1) Client

We create the client with OpenAI(), which takes 2 arguments:
1. api_key: the API key for connecting to the API
2. base_url: the base URL of the AI service, which for Gemini is "https://generativelanguage.googleapis.com/v1beta/openai/"
In this example, we call OpenAI() like this:
```
# Create a client
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
```
Note: In real use, replace "YOUR_API_KEY" with your actual API key (see Using Gemini API keys for how to create a free API key).

.

💼 (2) JD

The second input for the analysis is the JD, which we can set as a string like this:
```
# Set the job description (JD)
web_dev_jd = """
Senior Web Developer

We're looking for a Senior Web Developer with a strong background in front-end development and a passion for creating dynamic, intuitive web experiences. The ideal candidate will have extensive experience with the entire development lifecycle, from project conception to final deployment and quality assurance. This role requires a blend of technical skill, creative collaboration, and a commitment to solving complex programming challenges.

Responsibilities
* Cooperate with designers to create clean, responsive interfaces and intuitive user experiences.
* Develop and maintain project concepts, ensuring an optimal workflow throughout the development cycle.
* Work with a team to manage large, complex design projects for corporate clients.
* Complete detailed programming tasks for both front-end and back-end server code.
* Conduct quality assurance tests to discover errors and optimize usability for all projects.

Qualifications
* Bachelor's degree in Computer Information Systems or a related field.
* Proven experience in all stages of the development cycle for dynamic web projects.
* Expertise in programming languages including PHP OOP, HTML5, JavaScript, CSS, and MySQL.
* Familiarity with various PHP frameworks such as Zend, Codeigniter, and Symfony.
* A strong background in project management and customer relations.
"""
```
Note: If the JD is a PDF file, we can extract its text the same way as for the resumes.

.

📄 (3) Resumes

The last input is the resumes we want to analyse.

In this example, we extract the resume data from PDF files in Google Drive in 3 steps:

Step 1. Connect to Google Drive with drive.mount():
```
# Connect to Google Drive
drive.mount("/content/drive")
```
Note: Google asks us to confirm access to the files in Drive; confirm to continue.

Step 2. Set the file paths of the PDF files in Google Drive:
```
# Set resume file paths
rs_file_paths = {
    "George Evans": "/content/drive/My Drive/Resumes/cv_george_evans.pdf",
    "Robert Richardson": "/content/drive/My Drive/Resumes/cv_robert_richardson.pdf",
    "Christine Smith": "/content/drive/My Drive/Resumes/cv_christine_smith.pdf"
}
```
Note: This example uses 3 resumes (free resumes can be downloaded from www.coolfreecv.com).

Step 3. Extract the text from the resumes with a for loop and PyPDF2:
```
# Extract resume texts

# Instantiate a collector
rs_texts = {}

# Loop through resume files to get text
for key in rs_file_paths:

    # Instantiate an empty string to store the extracted text
    rs_text = ""

    # Open the PDF file
    reader = PyPDF2.PdfReader(rs_file_paths[key])

    # Loop through the pages
    for i in range(len(reader.pages)):

        # Extract the text from the page
        text = reader.pages[i].extract_text()

        # Append the text to the string
        rs_text += text

    # Collect the extracted text
    rs_texts[key] = rs_text
```
An example PDF and the text extracted from it:

Source: www.coolfreecv.com
```
Contact  
+1 (970) 343  888 999 
george.evans@gmail.com  
<https://www.coolfreecv.com>  
32 ELM STREET MADISON, SD 
57042  
 George  Evans  
PHP / OOP   
Zend Framework  Summary  
Senior Web Developer specializing in front end development . 
Experienced with all stages of the development cycle for dynamic 
web projects. Well -versed in numerous programming languages 
including HTML5, PHP OOP, JavaScript, CSS, MySQL. Strong 
background in project management and customer relations. 
Perceived as versatile, unconventional and committed, I am 
looking for new and interesting programming challenges.  
Experience  
Web Developer - 09/201 8 to 05/20 22 
Luna Web Design, New York  
• Cooperate with designers to create clean interfaces and 
simple, intuitive interactions and experiences.  
• Develop project concepts and maintain optimal workflow.  
• Work with senior developer to manage large, complex 
design projects for corporate clients.  
• Complete detailed programming and development tasks 
for front end public and internal websites as well as 
challenging back -end server code.  
• Carry out quality assurance tests to discover errors and 
optimize usability.  
Education  
Bachelor of Science: Computer Information Systems  - 2018  
Columbia University, NY  
 
Certifications  
PHP Framework (certificate): Zend, Codeigniter, Symfony. 
Programming Languages: JavaScript, HTML5, PHP OOP, CSS, SQL, 
MySQL.  
Reference  
Adam Smith - Luna Web Design  
adam.smith@luna.com  +1(970 )555 555  Skills   
JavaScript   Symfony Framework
```
⚡ 3. Analyse the Resumes

In the final step, we compare how well each resume fits the position (JD) in 4 steps:
1. Create a function that calls Gemini
2. Create a function that puts the inputs into the prompt
3. Analyse the resumes with a for loop and the functions from steps 1 and 2
4. Print the analysis results
.

🤖 (1) Function for Calling Gemini

First, we create a function for calling Gemini so that the AI is easy to use.

In this example, the function takes 3 arguments:
1. prompts: a list holding the system prompt and the user prompt
2. model: the Gemini model to call (such as Gemini 2.5 Flash)
3. temp: the model's temperature (how creative it is), between 0 and 2. Values near 0 make the answers more consistent from run to run, and values near 2 make them more varied
```
# Create a function to get a Gemini response
def get_gemini_response(prompts, model, temp):

    # Generate a response
    response = client.chat.completions.create(

        # Set the prompts
        messages=prompts,

        # Set the model
        model=model,

        # Set the temperature
        temperature=temp
    )

    # Return the response
    return response.choices[0].message.content
```
.

➕ (2) Function for Putting the Inputs into the Prompt

In step 2, we create a function that combines the inputs with the prompt so that the result is ready to pass to the function from step 1.

In this example, we create the function like this:
```
# Create a function to concatenate prompt + JD + resume
def concat_input(jd_text, rs_text):

    # Set the system prompt
    system_prompt = """
    # 1. Your Role
    You are an expert technical recruiter and resume analyst.
    """

    # Set the user prompt
    user_prompt = f"""
    # 2. Your Task
    Your task is to meticulously evaluate a candidate's resume against a specific job description (JD) and provide a detailed pre-screening report.

    Your analysis must be structured with the following sections and include specific, data-driven insights.

    ## 1. Strengths
    - Identify and elaborate on top three key strengths.
    - For each strength, briefly provide specific evidence from the resume (e.g., "The candidate's experience with Python and Django, as shown in their role at Acme Corp, directly addresses the JD's requirement for...") and explain how it directly fulfills a requirement in the JD.

    ## 2. Weaknesses
    - Identify top three areas where the candidate's experience or skills may not fully align with the JD's requirements.
    - For each point, briefly explain the potential concern and why it might be a risk for the role (e.g., "The JD requires experience with AWS, but the resume only mentions exposure to Azure. This could indicate a gap in cloud infrastructure expertise.").

    ## 3. Candidate Summary
    - Draft a concise summary of the candidate's professional background.
    - Emphasise their JD-relevant core responsibilities, key achievements, and career progression as evidenced in the resume.

    ## 4. Overall Fit Score
    - Provide a numerical score from 1 to 100, representing the overall alignment of the candidate's profile with the JD.
    - A higher score indicates a stronger match: 80-100 = best match; 60-80 = strong match; 0-40 = weak match.

    ## 5. Hiring Recommendation
    - Conclude with a clear, binary hiring recommendation: "🟢 Proceed to interview", "🟡 Add to waitlist", or "🔴 Do not proceed".
    - Justify this recommendation with a brief, objective explanation based on the analysis above.

    ---

    # 3. Your Output
    - Use a professional and objective tone.
    - Base your analysis solely on the provided resume and JD. Do not make assumptions.
    - Be concise and to the point; no more than 30 words per sentence; the hiring manager needs to quickly grasp the key findings.
    - Format your final report using markdown headings and bullet points for readability.

    Output template:
    '''
    # [candidate's name (Title Case)] ([fit score]/100)

    [recommendation]: [justification]

    ## Profile Summary:
    [summary]

    ## Strengths:
    - [strength 1]
    - [strength 2]
    - [strength 3]

    ## Weaknesses:
    - [weakness 1]
    - [weakness 2]
    - [weakness 3]
    '''

    ---

    # 4. Your Input
    **1. JD:**
    {jd_text}

    **2. Resume:**
    {rs_text}

    ---

    Generate the report.
    """

    # Collect prompts
    prompts = [
        {
            "role": "system",
            "content": textwrap.dedent(system_prompt)
        },
        {
            "role": "user",
            "content": textwrap.dedent(user_prompt)
        }
    ]

    # Return the prompts
    return prompts
```
Note: We use textwrap.dedent() to remove the leading whitespace that comes from indenting the prompt inside the function. This avoids confusing the AI with stray indentation and saves input tokens.
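One detail worth checking: textwrap.dedent() only strips whitespace that is common to every non-blank line. If multi-line text with unindented lines (as JD or resume text normally has) is inserted before dedenting, there is no common indentation left to remove. The sketch below compares the two orders on a made-up resume string; `build_prompt()` and `build_prompt_late_dedent()` are hypothetical helpers, not the article's function.

```python
import textwrap


def build_prompt(resume_text):
    """Dedent the template first, then insert the text."""
    template = textwrap.dedent("""
        # Your Input
        {resume}

        Generate the report.
    """)
    return template.format(resume=resume_text)


def build_prompt_late_dedent(resume_text):
    """Insert the text first, then dedent (the order when dedent wraps an f-string)."""
    prompt = f"""
        # Your Input
        {resume_text}

        Generate the report.
    """
    return textwrap.dedent(prompt)


# Made-up two-line resume; its second line starts at column 0.
resume = "George Evans\nSenior Web Developer"

early = build_prompt(resume)
late = build_prompt_late_dedent(resume)

print(early)
print(late)
```

In the second version the heading keeps its eight leading spaces, because the unindented resume line leaves no shared margin. Dedenting the template before filling it in avoids that.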

.

🤔 (3) Analyse the Resumes

Step 3 is the most important one. We analyse the resumes by:
- using the functions from steps 1 and 2 to build the prompt and send it to Gemini
- using a for loop to send every resume to Gemini
```
# Instantiate a response collector
results = {}

# Loop through the resumes
for rs_name, rs_text in rs_texts.items():

    # Create the prompts
    prompts = concat_input(web_dev_jd, rs_text)

    # Get the Gemini response
    response = get_gemini_response(prompts=prompts, model="gemini-2.5-flash", temp=0.5)

    # Collect the response
    results[rs_name] = response
```
After running this code, the results are stored in results.

.

👀 (4) Print the Results

Finally, we print the analysis results by:
- using a for loop to print every result
- using Console with Markdown to make the text easier to read:
```
# Instantiate a console
console = Console()

# Instantiate a counter
i = 1

# Print the results
for rs_name, analysis_result in results.items():

    # Print the resume name
    print(f"👇 {i}. {rs_name}:")

    # Print the response
    console.print(Markdown(analysis_result))

    # Add spacers and divider
    print("\n")
    print("-----------------------------------------------------------")
    print("\n")

    # Add a counter
    i += 1
```
Example output:

In this example, George Evans comes out as a good fit for the Senior Web Developer role.
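With more than a few resumes, it can help to rank the reports by fit score. The sketch below assumes the model followed the output template, so the first heading ends with "([fit score]/100)"; that is not guaranteed, so the helper returns None when no score is found. `fit_score()` and `sample_results` are hypothetical and the reports are made up.

```python
import re


def fit_score(report):
    """Return the fit score from the report heading, or None if it is absent."""
    match = re.search(r"\((\d{1,3})\s*/\s*100\)", report)
    return int(match.group(1)) if match else None


# Made-up reports in the shape of the output template.
sample_results = {
    "Candidate A": "# Candidate A (85/100)\n\n🟢 Proceed to interview: ...",
    "Candidate B": "# Candidate B (55/100)\n\n🟡 Add to waitlist: ...",
    "Candidate C": "The model ignored the template.",
}

# Reports without a score sort last (treated as 0).
ranked = sorted(
    sample_results,
    key=lambda name: fit_score(sample_results[name]) or 0,
    reverse=True,
)

print(ranked)
```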

😺 Code & Input Examples
- The example code is available on Google Colab
- The example JD and resumes are available at JD & Resumes
2025-11-13

Operator	Meaning
`&`	And
`|`	Or
`~`	Not

4 steps for using the google-genai library to work with the Gemini API: an example of generating a one-of-a-kind recipe

In this article, we walk through 4 steps for using google-genai, the official library for working with the Gemini API, through an example of generating a recipe in Google Colab:

1. Import packages
2. Create client
3. Create function
4. Generate response

If you're ready, let's get started.

📦 Import Packages

First, we import the 4 items we need:

From	Function/Class	For
`google`	`genai`	Working with the Gemini API
`google.genai.types`	`GenerateContentConfig`	Configuring Gemini
`google.colab`	`userdata`	Reading the API key from the Secrets menu in Google Colab
`pydantic`	`BaseModel`	Defining the structure of the response from Gemini

# Import packages

# google-genai library
from google import genai
from google.genai.types import GenerateContentConfig

# Secret key
from google.colab import userdata

# pydantic
from pydantic import BaseModel

🧑‍💼 Create Client

In step 2, we create a client for working with the Gemini API.

For security, we store the API key in the Secrets menu of Google Colab.

We can add the API key either by importing it with the “Gemini API keys” button or by adding it ourselves with the “Add new secret” button:

Once the API key is stored in Secrets, we can read it with userdata.get(), which takes 1 argument, the name of the secret:

# Get API key
my_api = userdata.get("GOOGLE_API_KEY")

Next, we create the client with genai.Client(), passing the API key as the api_key argument:

# Create client
client = genai.Client(api_key=my_api)

Note:

- If keeping the API key secret is not a concern, we can pass it to genai.Client() directly, e.g. genai.Client(api_key="g04821..."), although a hard-coded key is visible to anyone who can see the notebook
- We can create an API key for free by going to Google AI Studio and clicking “Create API key”

📲 Create Function

In step 3, we create a function for calling Gemini, which takes 3 arguments:

1. model: the Gemini model to call
2. user_prompt: the user prompt
3. config: the model's settings

All 3 arguments are passed to client.models.generate_content():

# Create a function to get Gemini response
def get_response(model, user_prompt, config):

    # Get response
    response = client.models.generate_content(

        # Set model
        model=model,

        # Set user prompt
        contents=user_prompt,

        # Set config
        config=config
    )

    # Return response
    return response.text

📬 Generate Response

In step 4, we get a response from Gemini using the function we created in step 3.

Because the function takes 3 arguments, we need to set these 3 things before generating a response:

1. Model
2. User prompt
3. Configuration

🤖 Set Model

In this example, we use Gemini 2.5 Flash as the model, which we set like this:

# Set model
gemini_model = "gemini-2.5-flash"

Note: Other model names are listed at Gemini Models

🧑‍💻 Set User Prompt

The user prompt can be set as a string like this:

# Set user prompt
gemini_user_prompt = """
Create a healthy Thai-inspired burger for one person.

Protein: chicken or tofu
Bun: whole-wheat if possible (or lettuce wrap)

Deliver (match field names exactly):
- `menu` (string)
- `ingredient` (list of items with name, description, amount, unit)
- `steps` (30-word strings)
- `calorie_kcal` (float, total for the dish)
"""

🛠️ Set Configuration

The configuration lets us set many of the model's parameters.

In this example, we set 3 of them:

1. System prompt
2. Temperature
3. Output type and structure

Setting 1. The system prompt is the prompt that sets how Gemini behaves when responding to our user prompt.

We can set the system prompt as a string like this:

# Set system prompt
system_prompt = """
You are a highly experienced home cook specialising in healthy Thai-style food.

Constraints:
- Single-serving
- Favour grilling/pan-searing over deep-frying
- Keep ingredients common in Thai kitchens
- Keep steps <=7
- Include an approximate total calories for the whole dish
- Keep language simple
- Return JSON only that matches the given schema exactly (no extra fields)
"""

Setting 2. Temperature takes a value between 0 and 2:

- Values near 0 make the response more deterministic
- Values near 2 make the response more creative

Note: The default temperature is 1 (Generate content with the Gemini API in Vertex AI)

In this example, we set the temperature to 2 to make the response as creative as possible:

# Set temperature
temp = 2

Setting 3. For the output type and structure, we do the following:

Set the type to "application/json" so that the response is a JSON object:

# Set output type
output_type = "application/json"

Note: Other types are listed at Structured output

Define the structure of the JSON object with classes that inherit from BaseModel:

# Set output structure
class Ingredient(BaseModel):
    name: str
    description: str
    amount: float
    unit: str

class OutputStructure(BaseModel):
    menu: str
    ingredient: list[Ingredient]
    steps: list[str]
    calorie_kcal: float

Note: See JSON Schema for how to use BaseModel

After setting the system prompt, temperature, and output type and structure, we combine them in GenerateContentConfig() like this:

# Set configuration
gemini_config = GenerateContentConfig(

    # Set system prompt
    system_instruction=system_prompt,

    # Set temperature
    temperature=temp,

    # Set response type
    response_mime_type=output_type,

    # Set response structure
    response_schema=OutputStructure
)

Note: Other parameters that can be set in GenerateContentConfig() are listed at Content generation parameters

📖 Generate Response

After setting the arguments, we call the function to get the response like this:

# Generate a recipe
recipe = get_response(

    # Set model
    model=gemini_model,

    # Set user prompt
    user_prompt=gemini_user_prompt,

    # Set configuration
    config=gemini_config
)

🖨️ Print Response

Finally, we view the response with print():

# Print response
print(recipe)

Output:

{
  "menu": "Thai Chicken Burger",
  "ingredient": [
    {
      "name": "Ground Chicken",
      "description": "Lean ground chicken",
      "amount": 150.0,
      "unit": "g"
    },
    {
      "name": "Whole-wheat Burger Bun",
      "description": "Standard size",
      "amount": 1.0,
      "unit": "unit"
    },
    {
      "name": "Lime Juice",
      "description": "Freshly squeezed",
      "amount": 1.0,
      "unit": "tablespoon"
    },
    {
      "name": "Fish Sauce",
      "description": "Thai fish sauce",
      "amount": 1.0,
      "unit": "tablespoon"
    },
    {
      "name": "Fresh Ginger",
      "description": "Grated",
      "amount": 1.0,
      "unit": "teaspoon"
    },
    {
      "name": "Garlic",
      "description": "Minced",
      "amount": 1.0,
      "unit": "clove"
    },
    {
      "name": "Cilantro",
      "description": "Fresh, chopped",
      "amount": 2.0,
      "unit": "tablespoons"
    },
    {
      "name": "Green Onion",
      "description": "Chopped",
      "amount": 1.0,
      "unit": "tablespoon"
    },
    {
      "name": "Red Chilli",
      "description": "Finely minced (optional)",
      "amount": 0.5,
      "unit": "teaspoon"
    },
    {
      "name": "Lettuce Leaf",
      "description": "Fresh, crisp",
      "amount": 1.0,
      "unit": "large"
    },
    {
      "name": "Cucumber",
      "description": "Sliced thinly",
      "amount": 3.0,
      "unit": "slices"
    },
    {
      "name": "Cooking Oil",
      "description": "Any neutral oil",
      "amount": 1.0,
      "unit": "teaspoon"
    }
  ],
  "steps": [
    "Combine ground chicken with fish sauce, lime juice, grated ginger, minced garlic, chopped cilantro, and green onion in a bowl. Mix thoroughly.",
    "Form the seasoned chicken mixture into a single, uniform burger patty. If using chilli, incorporate it now.",
    "Heat cooking oil in a non-stick pan over medium heat. Cook the chicken patty for 5-7 minutes per side, or until it is thoroughly cooked through.",
    "While the patty cooks, lightly toast the whole-wheat burger bun in a dry pan or toaster until golden brown.",
    "Assemble your burger: Place the cooked chicken patty on the bottom half of the toasted bun. Top with fresh lettuce and cucumber slices.",
    "Complete the burger with the top bun. Serve immediately and enjoy your healthy Thai-inspired meal."
  ],
  "calorie_kcal": 450.0
}

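Because the response is a JSON string, it can be turned into Python objects with the standard-library json module. This is a minimal sketch: `recipe_text` is a shortened, made-up stand-in for the string held in recipe, so the example runs without calling the API.

```python
import json

# Shortened, made-up response in the same shape as the output above.
recipe_text = """
{
  "menu": "Thai Chicken Burger",
  "ingredient": [
    {"name": "Ground Chicken", "description": "Lean ground chicken", "amount": 150.0, "unit": "g"},
    {"name": "Lime Juice", "description": "Freshly squeezed", "amount": 1.0, "unit": "tablespoon"}
  ],
  "steps": ["Mix the ingredients.", "Cook the patty."],
  "calorie_kcal": 450.0
}
"""

# Parse the JSON string into a dictionary.
recipe_data = json.loads(recipe_text)

# Access the fields defined by the output structure.
print(recipe_data["menu"])
for item in recipe_data["ingredient"]:
    print(f'- {item["amount"]} {item["unit"]} {item["name"]}')
print(f'{len(recipe_data["steps"])} steps, about {recipe_data["calorie_kcal"]} kcal')
```

With the real response, the same call would be json.loads(recipe), since the function in step 3 returns response.text.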
That completes the flow for working with the Gemini API using the google-genai library.

😺 Google Colab

The full example code is available on Google Colab


2025-11-06

How to use 9 arguments of read_csv() from the pandas library to load data in Python: an example with football match data

pandas is a Python library for working with tabular data, and it has a wide range of functions for loading data into Python.

One of the most widely used is read_csv(), which loads CSV (Comma-Separated Values) data and has 9 main arguments:

1. filepath_or_buffer: the file path, file name, or URL of the file to load
2. sep: sets the delimiter
3. header: sets the row used as the header
4. skiprows: sets the rows to skip
5. nrows: sets the number of rows to load
6. usecols: sets the columns to load
7. index_col: sets the column to use as the index
8. names: sets the column names
9. dtype: sets the data types of the columns

In this article, we look at how to use all 9 arguments of read_csv() to load sample data on football matches in England.

If you're ready, let's get started.

🏁 Getting Started

Before using read_csv(), we need to install and import pandas:

# Install pandas
!pip install pandas

# Import pandas
import pandas as pd

Note: if pandas is already installed, the import statement alone is enough.

🗃️ Argument #1. filepath_or_buffer

filepath_or_buffer is the main argument, and we have to set it every time we call read_csv().

For example, suppose we have football match data (matches_clean.csv):

MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
M004,Man City,Aston Villa,4,2,2024-09-15
M005,Newcastle,West Ham,0,0,2024-09-22
M006,Brighton,Leeds,2,3,2024-09-29

We can use read_csv() like this:

# Load the dataset
df1 = pd.read_csv("matches_clean.csv")

# View the result
print(df1)

Result:

  MatchID           HomeTeam     AwayTeam  HomeGoals  AwayGoals   MatchDate
0    M001  Manchester United      Chelsea          2          1  2024-08-14
1    M002          Liverpool      Arsenal          1          1  2024-08-20
2    M003          Tottenham      Everton          3          0  2024-09-02
3    M004           Man City  Aston Villa          4          2  2024-09-15
4    M005          Newcastle     West Ham          0          0  2024-09-22
5    M006           Brighton        Leeds          2          3  2024-09-29
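
As the name suggests, filepath_or_buffer also accepts an in-memory buffer, not only a path or URL. The sketch below is a minimal illustration that assumes the first three matches are held in a string and wrapped in io.StringIO; `csv_text` and `df_buffer` are names made up for this example, not part of the original article:

```python
import io

import pandas as pd

# Assumption: the first three matches, kept in a string instead of a file
csv_text = """MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
"""

# io.StringIO turns the string into a file-like buffer that read_csv() can read
df_buffer = pd.read_csv(io.StringIO(csv_text))

# View the number of rows and columns
print(df_buffer.shape)
```

This can be handy for trying out the other arguments without creating files on disk.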

🤺 Argument #2. sep

sep sets the delimiter, the character that separates columns. The default is ",", so we normally don't need to set sep when the file is a CSV.

We use sep when the data has a different delimiter, such as ";" (matches_semicolon.txt):

MatchID;HomeTeam;AwayTeam;HomeGoals;AwayGoals;MatchDate
M001;Manchester United;Chelsea;2;1;2024-08-14
M002;Liverpool;Arsenal;1;1;2024-08-20
M003;Tottenham;Everton;3;0;2024-09-02
M004;Man City;Aston Villa;4;2;2024-09-15
M005;Newcastle;West Ham;0;0;2024-09-22
M006;Brighton;Leeds;2;3;2024-09-29

We can use sep like this:

# Load the dataset with ";" as delim
df2 = pd.read_csv("matches_semicolon.txt", sep=";")

# View the result
print(df2)

Result:

  MatchID           HomeTeam     AwayTeam  HomeGoals  AwayGoals   MatchDate
0    M001  Manchester United      Chelsea          2          1  2024-08-14
1    M002          Liverpool      Arsenal          1          1  2024-08-20
2    M003          Tottenham      Everton          3          0  2024-09-02
3    M004           Man City  Aston Villa          4          2  2024-09-15
4    M005          Newcastle     West Ham          0          0  2024-09-22
5    M006           Brighton        Leeds          2          3  2024-09-29

😶‍🌫️ Argument #3. header

header sets the row to use as the table header.

We use header when the first rows of the file hold something else, such as metadata (matches_with_metadata.txt):

# UK Football Matches Data
# Created for practice with pd.read_csv()
MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
M004,Man City,Aston Villa,4,2,2024-09-15
M005,Newcastle,West Ham,0,0,2024-09-22
M006,Brighton,Leeds,2,3,2024-09-29

We can use header like this:

# Load the dataset where the header is the 3rd row (0-based index 2)
df3 = pd.read_csv("matches_with_metadata.txt", header=2)

# View the result
print(df3)

Result:

  MatchID           HomeTeam     AwayTeam  HomeGoals  AwayGoals   MatchDate
0    M001  Manchester United      Chelsea          2          1  2024-08-14
1    M002          Liverpool      Arsenal          1          1  2024-08-20
2    M003          Tottenham      Everton          3          0  2024-09-02
3    M004           Man City  Aston Villa          4          2  2024-09-15
4    M005          Newcastle     West Ham          0          0  2024-09-22
5    M006           Brighton        Leeds          2          3  2024-09-29

Notice that the metadata lines are not loaded.

Note: we can set header=None when the data has no header row, as in matches_no_header.csv:

M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
M004,Man City,Aston Villa,4,2,2024-09-15
M005,Newcastle,West Ham,0,0,2024-09-22
M006,Brighton,Leeds,2,3,2024-09-29
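
A minimal sketch of header=None, using an in-memory copy of the first two rows rather than the actual file (the variable names are illustrative). With no header row, pandas labels the columns with integers starting at 0:

```python
import io

import pandas as pd

# Assumption: the first two rows of matches_no_header.csv, held in a string
no_header_text = """M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
"""

# header=None tells pandas that the first line is data, not column names
df_no_header = pd.read_csv(io.StringIO(no_header_text), header=None)

# View the automatically assigned column labels
print(df_no_header.columns.tolist())
```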

🛑 Argument #4. skiprows

skiprows sets the rows we don't want to load into Python. We can set it in 2 ways:

As an int (e.g. 2) to skip that many rows from the top of the file
As a list (e.g. [0, 1, 2]) to skip specific rows by their 0-based position

For example, say we want to skip the first 2 lines, which are metadata:

# UK Football Matches Data
# Created for practice with pd.read_csv()
MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
M004,Man City,Aston Villa,4,2,2024-09-15
M005,Newcastle,West Ham,0,0,2024-09-22
M006,Brighton,Leeds,2,3,2024-09-29

We can use skiprows like this:

# Load the dataset, skipping the metadata
df4 = pd.read_csv("matches_with_metadata.txt", skiprows=[0, 1])

# View the result
print(df4)

Result:

  MatchID           HomeTeam     AwayTeam  HomeGoals  AwayGoals   MatchDate
0    M001  Manchester United      Chelsea          2          1  2024-08-14
1    M002          Liverpool      Arsenal          1          1  2024-08-20
2    M003          Tottenham      Everton          3          0  2024-09-02
3    M004           Man City  Aston Villa          4          2  2024-09-15
4    M005          Newcastle     West Ham          0          0  2024-09-22
5    M006           Brighton        Leeds          2          3  2024-09-29
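
skiprows also accepts a plain integer. As a hedged sketch on an in-memory copy of the file (shortened to two matches, with illustrative variable names), skiprows=2 skips the first 2 lines and should give the same DataFrame as skiprows=[0, 1]:

```python
import io

import pandas as pd

# Assumption: a shortened in-memory copy of matches_with_metadata.txt
metadata_text = """# UK Football Matches Data
# Created for practice with pd.read_csv()
MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
"""

# An int skips that many lines from the top of the file
df_skip_int = pd.read_csv(io.StringIO(metadata_text), skiprows=2)

# A list skips the lines at those 0-based positions
df_skip_list = pd.read_csv(io.StringIO(metadata_text), skiprows=[0, 1])

# Check whether both approaches produce the same DataFrame
print(df_skip_int.equals(df_skip_list))
```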

📋 Argument #5. nrows

nrows sets how many rows we want to load into Python.

For example, instead of loading all of the data:

MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
M004,Man City,Aston Villa,4,2,2024-09-15
M005,Newcastle,West Ham,0,0,2024-09-22
M006,Brighton,Leeds,2,3,2024-09-29

We can load just the first 3 rows with nrows like this:

# Load the first 3 rows
df5 = pd.read_csv("matches_clean.csv", nrows=3)

# View the result
print(df5)

Result:

  MatchID           HomeTeam AwayTeam  HomeGoals  AwayGoals   MatchDate
0    M001  Manchester United  Chelsea          2          1  2024-08-14
1    M002          Liverpool  Arsenal          1          1  2024-08-20
2    M003          Tottenham  Everton          3          0  2024-09-02

☑️ Argument #6. usecols

usecols sets the columns we want to load into Python.

For example, to pick only HomeTeam and HomeGoals from:

MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
M004,Man City,Aston Villa,4,2,2024-09-15
M005,Newcastle,West Ham,0,0,2024-09-22
M006,Brighton,Leeds,2,3,2024-09-29

We can use usecols like this:

# Load only HomeTeam and HomeGoals
df6 = pd.read_csv("matches_clean.csv", usecols=["HomeTeam", "HomeGoals"])

# View the result
print(df6)

Result:

            HomeTeam  HomeGoals
0  Manchester United          2
1          Liverpool          1
2          Tottenham          3
3           Man City          4
4          Newcastle          0
5           Brighton          2

🔢 Argument #7. index_col

index_col sets the column to use as the index of the data, such as MatchID:

MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
M004,Man City,Aston Villa,4,2,2024-09-15
M005,Newcastle,West Ham,0,0,2024-09-22
M006,Brighton,Leeds,2,3,2024-09-29

We use index_col like this:

# Load the dataset with MatchID as index col
df7 = pd.read_csv("matches_clean.csv", index_col="MatchID")

# View the result
print(df7)

Result:

                  HomeTeam     AwayTeam  HomeGoals  AwayGoals   MatchDate
MatchID
M001     Manchester United      Chelsea          2          1  2024-08-14
M002             Liverpool      Arsenal          1          1  2024-08-20
M003             Tottenham      Everton          3          0  2024-09-02
M004              Man City  Aston Villa          4          2  2024-09-15
M005             Newcastle     West Ham          0          0  2024-09-22
M006              Brighton        Leeds          2          3  2024-09-29
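
One reason to set an index is label-based lookup with .loc. A small sketch, assuming an in-memory copy of the first three matches (`csv_text`, `df_indexed`, and `home_team` are illustrative names):

```python
import io

import pandas as pd

# Assumption: the first three matches, kept in a string instead of a file
csv_text = """MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
"""

# Use MatchID as the index
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col="MatchID")

# Look up a single value by its index label and column name
home_team = df_indexed.loc["M003", "HomeTeam"]
print(home_team)
```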

🔠 Argument #8. names

names sets the column names. We use it when:

The data has no header row
We want to rename the columns

For example, to give column names to matches_no_header.csv:

M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
M004,Man City,Aston Villa,4,2,2024-09-15
M005,Newcastle,West Ham,0,0,2024-09-22
M006,Brighton,Leeds,2,3,2024-09-29

We can use names like this:

# Set col names
col_names = [
    "id",
    "home",
    "away",
    "home_goals",
    "away_goals",
    "date"
]

# Load the dataset with custom col names
df8 = pd.read_csv("matches_no_header.csv", names=col_names)

# View the result
print(df8)

Result:

     id               home         away  home_goals  away_goals        date
0  M001  Manchester United      Chelsea           2           1  2024-08-14
1  M002          Liverpool      Arsenal           1           1  2024-08-20
2  M003          Tottenham      Everton           3           0  2024-09-02
3  M004           Man City  Aston Villa           4           2  2024-09-15
4  M005          Newcastle     West Ham           0           0  2024-09-22
5  M006           Brighton        Leeds           2           3  2024-09-29
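
The example above covers a file with no header. For the second case, renaming the columns of a file that already has a header, one approach is to pass names together with header=0 so that the existing header row is replaced rather than read as data. A minimal sketch on an in-memory copy (shortened to two matches; the variable names are illustrative):

```python
import io

import pandas as pd

# Assumption: a shortened in-memory copy of matches_clean.csv (with header)
csv_text = """MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
"""

# New column names to use instead of the ones in the file
col_names = ["id", "home", "away", "home_goals", "away_goals", "date"]

# header=0 marks the first line as the header so names can replace it
df_renamed = pd.read_csv(io.StringIO(csv_text), names=col_names, header=0)

# View the new column names
print(df_renamed.columns.tolist())
```

Without header=0, the original header line would be loaded as the first data row.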

⏹️ Argument #9. dtype

dtype sets the data types of the columns.

For example, to set the data types of MatchID, HomeGoals, and AwayGoals from matches_clean.csv:

MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
M003,Tottenham,Everton,3,0,2024-09-02
M004,Man City,Aston Villa,4,2,2024-09-15
M005,Newcastle,West Ham,0,0,2024-09-22
M006,Brighton,Leeds,2,3,2024-09-29

We can use dtype like this:

# Set col data types
col_dtypes = {
    "MatchID": str,
    "HomeGoals": "int32",
    "AwayGoals": "int32"
}

# Load the dataset, specifying data types for MatchID, HomeGoals, and AwayGoals
df9 = pd.read_csv("matches_clean.csv", dtype=col_dtypes)

# View the result
df9.info()

Result:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Data columns (total 6 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   MatchID    6 non-null      object
 1   HomeTeam   6 non-null      object
 2   AwayTeam   6 non-null      object
 3   HomeGoals  6 non-null      int32
 4   AwayGoals  6 non-null      int32
 5   MatchDate  6 non-null      object
dtypes: int32(2), object(4)
memory usage: 372.0+ bytes
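
To see what dtype changes, it can help to compare against the default. The sketch below, on an in-memory copy with two matches (all variable names are illustrative), loads the same data twice and prints the dtype of HomeGoals each time:

```python
import io

import pandas as pd

# Assumption: a shortened in-memory copy of matches_clean.csv
csv_text = """MatchID,HomeTeam,AwayTeam,HomeGoals,AwayGoals,MatchDate
M001,Manchester United,Chelsea,2,1,2024-08-14
M002,Liverpool,Arsenal,1,1,2024-08-20
"""

# Load once with pandas' inferred types and once with an explicit dtype
df_default = pd.read_csv(io.StringIO(csv_text))
df_typed = pd.read_csv(io.StringIO(csv_text), dtype={"HomeGoals": "int32"})

# Compare the data type of the same column in both DataFrames
print(df_default["HomeGoals"].dtype)
print(df_typed["HomeGoals"].dtype)
```

The values are the same in both DataFrames; only the storage type of the column differs.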

⚡ Summary

In this article, we looked at how to use 9 arguments of read_csv() from pandas to load data in Python:

filepath_or_buffer: the file to load
sep: the delimiter in the file
header: the row to use as the table header
skiprows: the rows to skip
nrows: the number of rows to load
usecols: the columns to load
index_col: the column to use as the index
names: the column names
dtype: the data types of the columns

😺 GitHub

See the example code and datasets from this article on GitHub.

2025-10-30

How to use open() to work with files in Python: usage, writing it with and without with, and the 4 file modes (+ bonus: deleting files), with examples

In this article, we'll look at how to use open() to work with files in Python:

Intro to open(): syntax and usage
4 modes: 4 ways to work with files
Bonus: how to delete a file

If you're ready, let's get started.

💻 Intro to open()

🔢 Syntax

open() is Python's built-in function for working with files, and it is typically called with 2 arguments:

open(filename, mode)

filename = the file name (a string such as "my_file.txt")
mode = the mode for working with the file (e.g. "r" for reading, which is also the default)

🗄️ Using open()

We can use open() in 2 ways:

Method 1. Open the file without with, which requires calling .close() to close the file when we're done:

# Open file
file = open(filename, mode)

# Act on file
file.method()

# Close file
file.close()

Method 2. Open the file using with:

# Open file
with open(filename, mode) as file:
    
    # Act on file
    file.method()

Method 2 is the more popular approach because the file is closed automatically when the with block ends, so we don't need to call .close() ourselves.
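
A quick way to see this is the file object's closed attribute. The sketch below is only an illustration; it writes to a temporary folder, and the file name demo.txt and the variable names are made up for the example:

```python
import os
import tempfile

# Work in a temporary folder so the sketch doesn't touch real files
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as file:
    file.write("hello\n")

    # Inside the with block, the file is still open
    closed_inside = file.closed

# After the with block, the file has been closed automatically
closed_after = file.closed

print(closed_inside, closed_after)
```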

🗂️ Mode

open() has 4 main modes for working with files:

Mode	Action	Note
`"x"`	Create a file	Raises an error if a file with the same name already exists
`"r"`	Read a file	Raises an error if the file doesn't exist
`"a"`	Append to a file	Creates a new file if one doesn't already exist
`"w"`	Overwrite the file's contents	Creates a new file if one doesn't already exist

Let's look at examples of all 4 modes.

📄 Create

Example of creating a file with "x":

# Create a file (write() doesn't add line breaks, so each string ends with "\n")
with open("example.txt", "x") as file:
    file.write("This is the first line.\n")
    file.write("This is the second line.\n")
    file.write("This is the third line.\n")

Result: we get a file named example.txt in the current working directory.
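
Since "x" raises an error when the file already exists, it can be worth handling that case. A minimal sketch, assuming a temporary folder so no real files are touched (`path` and `created_again` are illustrative names):

```python
import os
import tempfile

# Work in a temporary folder so the sketch doesn't touch real files
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# First call: the file doesn't exist yet, so "x" creates it
with open(path, "x") as file:
    file.write("This is the first line.\n")

# Second call: the file now exists, so "x" raises FileExistsError
try:
    with open(path, "x") as file:
        file.write("This will not be written.\n")
    created_again = True
except FileExistsError:
    created_again = False

print(created_again)
```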

📖 Read

We can read a file with "r" in 3 ways:

Method 1. Use .read() to read the entire content:

# Read the file - all
with open("example.txt", "r") as file:
    print(file.read())

Result in the console:

This is the first line.
This is the second line.
This is the third line.

Method 2. Use .readline() when we want to read one line at a time:

# Read the file - one line at a time
with open("example.txt", "r") as file:
    print(file.readline(), end="")
    print(file.readline(), end="")

Result in the console:

This is the first line.
This is the second line.

Method 3. Use a for loop to read the whole file one line at a time:

# Read the file - line by line
with open("example.txt", "r") as file:

    # Loop through each line
    for line in file:
        # Each line already ends with "\n", so stop print() from adding another
        print(line, end="")

Result in the console:

This is the first line.
This is the second line.
This is the third line.

➕ Append

Example of appending data with "a":

# Add content to the file
with open("example.txt", "a") as file:
    file.write("This is the fourth line.")

File content:

This is the first line.
This is the second line.
This is the third line.
This is the fourth line.

✏️ Write

Example of writing to a file with "w":

# Overwrite the file
with open("example.txt", "w") as file:
    file.write("This is all there is now.")

File content:

This is all there is now.
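
To make the difference between "a" and "w" concrete, here is a small side-by-side sketch. It uses a temporary folder and a made-up file name (notes.txt), so it is an illustration rather than part of the example.txt walkthrough:

```python
import os
import tempfile

# Work in a temporary folder so the sketch doesn't touch real files
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

# "w" creates the file because it doesn't exist yet
with open(path, "w") as file:
    file.write("line 1\n")

# "a" keeps the existing content and adds to the end
with open(path, "a") as file:
    file.write("line 2\n")

with open(path, "r") as file:
    after_append = file.read()

# "w" on an existing file replaces everything that was there
with open(path, "w") as file:
    file.write("only line\n")

with open(path, "r") as file:
    after_write = file.read()

print(repr(after_append))
print(repr(after_write))
```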

🍩 Bonus: Delete

To delete a file, we call the remove() function from the os module:

# Import os module
import os

# Delete the file
os.remove("example.txt")

Result: the file is deleted from the machine.
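
os.remove() raises FileNotFoundError when the file isn't there, so one option is to check with os.path.exists() first. A minimal sketch, where `delete_if_exists` is a hypothetical helper written for this example and the file lives in a temporary folder:

```python
import os
import tempfile

# Work in a temporary folder so the sketch doesn't touch real files
path = os.path.join(tempfile.mkdtemp(), "example.txt")

with open(path, "w") as file:
    file.write("temporary\n")


def delete_if_exists(file_path):
    """Delete file_path and return True, or return False if it isn't there."""
    if os.path.exists(file_path):
        os.remove(file_path)
        return True
    return False


# The first call deletes the file; the second finds nothing to delete
first_try = delete_if_exists(path)
second_try = delete_if_exists(path)

print(first_try, second_try)
```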

⚡ Summary

open() is Python's built-in function for working with files
open() takes 2 arguments:
- filename: the file name
- mode: the mode for working with the file
Usage:
- open() is usually paired with with
- Without with, we have to close the file with .close() when we're done
open() has 4 modes:
- "x": create a file
- "r": read a file
- "a": append content
- "w": overwrite existing data
Delete a file with os.remove()

😺 GitHub

See all the example code on GitHub.

2025-10-16

How to create functions in Python: def, docstring, arguments, and lambda, with examples
In this article, we'll look at how to create functions in Python. The article has 4 parts:
1. def syntax: using def to create a function
2. Docstring: documenting how to use a function
3. Arguments: defining a function's arguments
4. lambda: creating an anonymous function
If you're ready, let's get started.
💻 def Syntax

In Python, we can create a function with def, which has 4 parts:

    # Name and arguments
    def name(arguments):

        # Body
        Do something

        # Return
        return result

1. name = the function name
2. arguments = the function's inputs
3. Body = what the function does
4. Return = sends the result back out of the function *
(Note: * we can use print() instead of return when we only want to display the result in the console; in that case the function itself returns None)

For example, let's create a function that calculates BMI (body mass index), which takes 2 arguments: weight and height:
```
# Create a function that calculates BMI
def calculate_bmi(weight, height):

    # Calculate BMI
    bmi = weight / (height ** 2)

    # Round to 2 decimals
    bmi_rounded = round(bmi, 2)

    # Return BMI
    return bmi_rounded
```
Once the function is defined, we can call it by its name, for example:
```
# Use the BMI calculator function
my_bmi = calculate_bmi(weight=80, height=1.8)

# Print the result
print(my_bmi)
```
Result:
```
24.69
```
📃 Docstring

🤔 Why Docstring?

In the calculate_bmi() example, weight and height can take different values depending on the unit used, for example:
- height: metre = 1.8; feet = 5.9
- weight: kg = 80; pound = 176
If we pass the wrong values to the function, we get a wrong result back. For example, passing height in cm:
```
# Using incorrect input
wrong_bmi = calculate_bmi(weight=80, height=180)

# Print the result
print(wrong_bmi)
```
Result:
```
0.0
```
🥸 What Is Docstring?

We can solve this problem in 2 ways:
1. Name the arguments so that we know what to pass to the function (e.g. height_in_m, weight_in_kg)
2. Add a docstring, a string that documents how to use the function
We can add a docstring to a function like this:
```
# Adding docstring to the function
def calculate_bmi(height, weight):

    # Docstring
    """
    Calculate BMI using weight and height:
    - Weight: kg
    - Height: m

    Return BMI rounded to 2 decimals.
    """

    # Calculate BMI
    bmi = weight / (height ** 2)

    # Round to 2 decimals
    bmi_rounded = round(bmi, 2)

    # Return BMI
    return bmi_rounded
```
Pro tip: we should add docstrings to our functions, especially in code shared with others, so that they can understand how our functions work.
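
The first approach, descriptive argument names, isn't shown above. As a minimal sketch, assuming a variant named `calculate_bmi_named` (a hypothetical name chosen so it doesn't clash with calculate_bmi):
```python
# Variant of the BMI function whose argument names state the expected units
def calculate_bmi_named(weight_in_kg, height_in_m):

    # Calculate BMI
    bmi = weight_in_kg / (height_in_m ** 2)

    # Return BMI rounded to 2 decimals
    return round(bmi, 2)


# Calling with keywords makes the units visible at the call site
named_bmi = calculate_bmi_named(weight_in_kg=80, height_in_m=1.8)

# Print the result
print(named_bmi)
```
The names don't validate anything, so passing 180 for height_in_m would still run; they only make the mistake easier to spot when reading the call.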

😎 Reading Docstring

We can read a docstring in 2 ways:

Method 1. Use help():
```
# Read docstring with help()
help(calculate_bmi)
```
Result:
```
Help on function calculate_bmi in module __main__:

calculate_bmi(height, weight)
    Calculate BMI using weight and height:
    - Weight: kg
    - Height: m

    Return BMI rounded to 2 decimals.
```
Method 2. Use .__doc__:
```
# Read docstring with .__doc__:
print(calculate_bmi.__doc__)
```
Result:
```
Calculate BMI using weight and height:
    - Weight: kg
    - Height: m

    Return BMI rounded to 2 decimals.
```
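Depending on the Python version, .__doc__ may keep the docstring's leading blank line and indentation. As an optional third way that goes beyond the two methods above, the standard library's inspect.getdoc() returns the docstring with that whitespace cleaned up. A small sketch (the function body is condensed here for brevity):
```python
import inspect


def calculate_bmi(height, weight):
    """
    Calculate BMI using weight and height:
    - Weight: kg
    - Height: m

    Return BMI rounded to 2 decimals.
    """
    return round(weight / (height ** 2), 2)


# getdoc() strips the leading blank line and the common indentation
clean_doc = inspect.getdoc(calculate_bmi)

# Print the cleaned docstring
print(clean_doc)
```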
💬 Arguments

Let's look at 2 kinds of arguments we can define in functions:
1. Default arguments
2. Arbitrary arguments
🫡 Default Arguments

Default arguments are the values a function uses when we don't pass those arguments ourselves.

For example, let's create a function that raises a number to a power, using a power of 2 by default:
```
# Create a function with default arguments
def calculate_power(number, power=2):

    # Calculate number to the power of power
    result = number ** power

    # Return result
    return result

# Call the function without power
print(calculate_power(10))
```
Result:
```
100
```
But if we set power ourselves:
```
# Call the function with power
print(calculate_power(10, 3))
```
The result changes:
```
1000
```
😶‍🌫️ Arbitrary Arguments

Arbitrary arguments are what we use when we don't know in advance how many arguments there will be.

We can create a function that accepts an unspecified number of arguments in 2 ways:
1. *args: any number of positional arguments (arguments passed by position)
2. **kwargs: any number of keyword arguments (arguments passed by keyword)
As an example of *args, let's create a function that calculates the total price of the items in a shopping basket, where we don't know how many items there will be:
```
# Create a function calculate total price
def calculate_total_price(*prices):

    # Calculate sum
    total = sum(prices)

    # Return total
    return total

# Examples
total_basket_01 = calculate_total_price(500, 1000)
total_basket_02 = calculate_total_price(100, 200, 300)

print(f"Basket 1: {total_basket_01}")
print(f"Basket 2: {total_basket_02}")
```
Result:
```
Basket 1: 1500
Basket 2: 600
```
As an example of **kwargs, let's create a function that stores user data, where each user has a different amount of information:
```
# Create a function to return user's data
def user_profile(**user_data):
    return user_data

# Examples
print(f"User 1: {user_profile(name='John')}")
print(f"User 2: {user_profile(name='Jane', gender='F', age=20)}")
```
Result:
```
User 1: {'name': 'John'}
User 2: {'name': 'Jane', 'gender': 'F', 'age': 20}
```
Note:
- Arguments in *args are collected into a tuple
- Arguments in **kwargs are collected into a dictionary
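The note above can be checked with a function that accepts both kinds at once. This is a minimal sketch; `describe_call` is a hypothetical helper written only for this illustration:
```python
# Create a function that accepts both positional and keyword arguments
def describe_call(*args, **kwargs):

    # Report the container types and how many arguments each one holds
    return type(args).__name__, type(kwargs).__name__, len(args), len(kwargs)


# Pass 3 positional arguments and 2 keyword arguments
summary = describe_call(1, 2, 3, name="Jane", age=20)

# Print the result
print(summary)
```
When both are used in the same function, *args has to come before **kwargs in the definition.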
🛋️ lambda

lambda creates an anonymous function (a function without a name). We write it like this:
lambda arguments: expression
lambda is typically used for small functions, such as a function that adds two numbers:
```
# Create a function using lambda
addition = lambda a, b: a + b

# Call addition
print(addition(1, 1))
```
Result:
```
2
```
As we can see, the lambda in the example is equivalent to using def like this:
```
# Same as lambda
def addition(a, b):
    return a + b
```
Compared with def, lambda is shorter and simpler to write.

We typically use lambda when we want to create a simple function quickly.

And we typically use def when:
- Creating a complex function (one with multiple steps)
- Creating a function to share with others, because def makes the code easier for them to read
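A common place for a quick lambda is the key argument of sorted(). The sketch below is an illustration with made-up data (`items` and its prices are assumptions for this example):
```python
# Made-up basket of (item, price) pairs for illustration
items = [("pen", 25), ("notebook", 60), ("eraser", 10)]

# Use a lambda to tell sorted() to compare items by price (the 2nd element)
sorted_by_price = sorted(items, key=lambda item: item[1])

# Print the result
print(sorted_by_price)
```
Here the function is used once and fits in one expression, so giving it a name with def would add little.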
😺 GitHub

See all the example code on GitHub.

2025-10-09
