Pakistan's First Oracle Blog


How to Create an Urdu/Hindi Dataset and Fine-Tune an AI Model on It

Fri, 2024-03-15 21:54

 This video is a hands-on, step-by-step tutorial on creating a new dataset, fine-tuning an AI model on that dataset, and then pushing the result to Hugging Face.




Code:

%%capture
import torch
major_version, minor_version = torch.cuda.get_device_capability()
# Must install separately since Colab has torch 2.2.1, which breaks packages
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
if major_version >= 8:
    # Use this for new GPUs like Ampere, Hopper GPUs (RTX 30xx, RTX 40xx, A100, H100, L40)
    !pip install --no-deps packaging ninja flash-attn xformers trl peft accelerate bitsandbytes
else:
    # Use this for older GPUs (V100, Tesla T4, RTX 20xx)
    !pip install --no-deps xformers trl peft accelerate bitsandbytes
pass

!pip install einops

from unsloth import FastLanguageModel
import torch

max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-7b-bnb-4bit", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    token = " ", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 16, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    use_gradient_checkpointing = True,
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

# Urdu Alpaca-style prompt. Rough English gloss: "Below is an instruction that describes
# a film name, along with an input that provides further context. Write song lyrics that
# match the meaning of the film name."
alpaca_prompt = """ذیل میں ایک ہدایت ہے جو فلم کے نام کی وضاحت کرتی ہے، اس کے ساتھ ایک ان پٹ بھی ہے جو مزید دستاویزات فراہم کرتا ہے۔ گانے کے بول لکھنے کے لیے ایک لمحہ نکالیں جو فلم کے نام کے معنی سے میل کھاتا ہے۔

### Instruction:
{}

### Input:
{}

### Response:
{}"""

EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN

def formatting_prompts_func(examples):
    instructions = examples["urdu_instruction"]
    inputs       = examples["urdu_input"]
    outputs      = examples["urdu_output"]
    texts = []
    for instruction, input, output in zip(instructions, inputs, outputs):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = alpaca_prompt.format(instruction, input, output) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts, }
pass

from datasets import load_dataset
dataset = load_dataset("fahdmirzac/urdu_bollywood_songs_dataset", split = "train")
dataset = dataset.map(formatting_prompts_func, batched = True,)

from huggingface_hub import login
access_token = "<your huggingface access token>"  # never publish a real token
login(token=access_token)

from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False, # Can make training 5x faster for short sequences.
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 100,
        learning_rate = 2e-4,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)

trainer_stats = trainer.train()

FastLanguageModel.for_inference(model) # Enable native 2x faster inference
inputs = tokenizer(
[
    alpaca_prompt.format(
        "دیے گئے فلم کے نام کے بارے میں ایک مختصر گیت کے بول لکھیں۔", # instruction: "Write short song lyrics about the given film name."
        "کیوں پیار ہو گیا", # input (film name)
        "", # output - leave this blank for generation!
    )
], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 200, use_cache = True)
tokenizer.batch_decode(outputs)

FastLanguageModel.for_inference(model) # Enable native 2x faster inference
inputs = tokenizer(
[
    alpaca_prompt.format(
        "دیے گئے فلم کے نام کے بارے میں ایک مختصر گیت کے بول لکھیں۔", # instruction: "Write short song lyrics about the given film name."
        "رنگ", # input (film name)
        "", # output - leave this blank for generation!
    )
], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 200, use_cache = True)
tokenizer.batch_decode(outputs)

model.push_to_hub("fahdmirzac/Gemma_Urdu_Hindi_Bollywood_Songs", token = "<your huggingface access token>")
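# Note: the dataset-creation step mentioned in the intro is not shown above; the listing
# loads an existing Hub dataset. A minimal sketch of building and pushing such a dataset,
# assuming a local CSV with urdu_instruction, urdu_input and urdu_output columns
# (the file name and repository ID below are placeholders):
from datasets import load_dataset
new_dataset = load_dataset("csv", data_files="urdu_songs.csv", split="train")
new_dataset.push_to_hub("your-username/urdu_bollywood_songs_dataset", token="<your huggingface access token>")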

Categories: DBA Blogs

Using Claude 3 Haiku Vision with Amazon Bedrock Locally

Fri, 2024-03-15 02:58

 This video is a hands-on guide to using the vision features of Anthropic's Claude 3 Haiku model with Amazon Bedrock.



Code Used:

import boto3
import json
import base64
from botocore.exceptions import ClientError

bedrock = boto3.client(service_name="bedrock-runtime",region_name='us-east-1')

modelId = "anthropic.claude-3-haiku-20240307-v1:0"

accept = "application/json"
contentType = "application/json"


# prompt = "What is written in this image?"
# image_path = "./images/ab55.png"

# prompt = "How many faces are there in this image and what are the expressions of those faces?"
# image_path = "./images/expression.png"

# prompt = "Tell me a short story about this image."
# image_path = "./images/hiking.png"

prompt = "What's the location in this image?"
image_path = "./images/whereisthis.png"


with open(image_path, "rb") as image_file:
    image = base64.b64encode(image_file.read()).decode("utf8")

request_body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 2048,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": prompt,
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image,
                    },
                },
            ],
        }
    ],
}

try:
    response = bedrock.invoke_model(
        modelId=modelId,
        body=json.dumps(request_body),
    )

    # Process and print the response
    result = json.loads(response.get("body").read())
    input_tokens = result["usage"]["input_tokens"]
    output_tokens = result["usage"]["output_tokens"]
    output_list = result.get("content", [])

    # print("Invocation details:")
    # print(f"- The input length is {input_tokens} tokens.")
    # print(f"- The output length is {output_tokens} tokens.")

    # print(f"- The model returned {len(output_list)} response(s):")
    for output in output_list:
        print(output["text"])

except ClientError as err:
    print(
        "Couldn't invoke Claude 3 Haiku Vision. Here's why: "
        f"{err.response['Error']['Code']}: {err.response['Error']['Message']}"
    )
    raise
Categories: DBA Blogs

Create AI Agent in AWS with Boto3 Code

Thu, 2024-03-14 22:03

 This video is a step-by-step tutorial, with code, on creating Amazon Bedrock AI agents with boto3 in Python and integrating them with Lambda.



Code used (pair it with any Lambda function of your choice; a minimal sketch of a matching handler follows the listing):


import logging
import boto3
import time
import yaml
import json
import io
from botocore.exceptions import ClientError

def create_agent(bedrock, agent_name, foundation_model, role_arn, instruction):
    try:
        # Create the agent via the bedrock-agent client
        response = bedrock.create_agent(
            agentName=agent_name,
            foundationModel=foundation_model,
            agentResourceRoleArn=role_arn,
            instruction=instruction,
        )
    except ClientError as e:
        logging.error(f"Couldn't create agent due to: {e}")
        raise
    else:
        return response["agent"]

def create_agent_action_group(bedrock, name, description, agent_id, agent_version, function_arn, api_schema):
    try:
        response = bedrock.create_agent_action_group(
            actionGroupName=name,
            description=description,
            agentId=agent_id,
            agentVersion=agent_version,
            actionGroupExecutor={"lambda": function_arn},
            apiSchema={"payload": api_schema},
        )
        agent_action_group = response["agentActionGroup"]
    except ClientError as e:
        print(f"Error: Couldn't create agent action group. Here's why: {e}")
        raise
    else:
        return agent_action_group

def prepare_agent(bedrock, agent_id):
    try:
        prepared_agent_details = bedrock.prepare_agent(agentId=agent_id)
    except ClientError as e:
        print(f"Couldn't prepare agent. {e}")
        raise
    else:
        return prepared_agent_details

def create_agent_alias(bedrock, name, agent_id):
    try:
        response = bedrock.create_agent_alias(
            agentAliasName=name, agentId=agent_id
        )
        agent_alias = response["agentAlias"]
    except ClientError as e:
        print(f"Couldn't create agent alias. {e}")
        raise
    else:
        return agent_alias



def main():
    # Define your parameters
    bedrock = boto3.client(service_name='bedrock-agent',region_name='us-east-1')
    agent_name = 'AstroAI'
    foundation_model = 'anthropic.claude-v2'
    role_arn = 'bedrock role arn'
    instruction = 'Your task is to generate unique and insightful daily horoscopes for individuals \
                   based on their zodiac sign. Start by analyzing the general characteristics and common \
                   themes associated with each zodiac sign. Consider traits, challenges, opportunities, \
                   and the emotional and physical wellbeing of individuals under each sign. Use this \
                   understanding to create personalized, relevant, and engaging horoscopes that offer \
                   guidance, reflection, and encouragement for the day ahead. Ensure the horoscopes \
                   are varied and resonate with the unique qualities of each sign, contributing \
                   positively to the users day.'

    # Call the create_agent function
    try:
        agent = create_agent(bedrock, agent_name, foundation_model, role_arn, instruction)
        agent_id = agent['agentId']
        print(f"Agent created successfully: {agent_id}")
    except ClientError:
        print("Failed to create the agent.")

    time.sleep(10)

    try:
        with open("api_schema.yaml") as file:
            api_schema=json.dumps(yaml.safe_load(file))
            name="AstroGroup"
            description="AI Astrologer"
            agent_version="DRAFT"
            function_arn="arn:aws:lambda:us-east-1::function:horoscope"
            agentgroup = create_agent_action_group(bedrock, name, description, agent_id, agent_version, function_arn, api_schema)                
            print(agentgroup['actionGroupId'])
    except ClientError as e:
        print(f"Couldn't create agent action group. Here's why: {e}")
        raise        

    time.sleep(5)

    agentprepared = prepare_agent(bedrock, agent_id)                
    print(agentprepared)

    time.sleep(20)

    agentalias = create_agent_alias(bedrock, name, agent_id)
    print(agentalias['agentAliasId'])

if __name__ == "__main__":
    main()
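# For reference, a minimal sketch of the Lambda handler the action group above could invoke.
# This would live in the Lambda function's own code, not in the script above. The response
# envelope follows the Bedrock agent Lambda convention, but the apiPath, parameter name and
# horoscope text are illustrative assumptions, not taken from the video:
import json

def lambda_handler(event, context):
    # Bedrock agents pass the invoked API path, HTTP method and parameters in the event.
    sign = next((p["value"] for p in event.get("parameters", []) if p["name"] == "sign"), "aries")
    body = {"horoscope": f"Today looks promising for {sign}."}
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event.get("actionGroup"),
            "apiPath": event.get("apiPath", "/horoscope"),
            "httpMethod": event.get("httpMethod", "GET"),
            "httpStatusCode": 200,
            "responseBody": {"application/json": {"body": json.dumps(body)}},
        },
    }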

Categories: DBA Blogs

How AI Transcription Can Make Your Next Job Interview More Effective

Thu, 2024-02-29 01:47

 If you are looking to use AI transcription during your job interview to get help from an LLM, this video will help you immensely.





In the rapidly evolving landscape of recruitment, the integration of Artificial Intelligence (AI) tools has been a game-changer, especially AI transcription services. These services, which convert speech into a written or electronic text document, are now being increasingly utilized during candidate interviews. This innovative approach is not just about efficiency; it's about transforming the interview process into a more insightful, fair, and strategic component of talent acquisition. Let's delve into the merits of using AI transcription in candidate interviews and how it is reshaping the future of hiring.

Enhanced Accuracy and Objectivity

One of the paramount benefits of using AI transcription is the enhanced accuracy and objectivity it brings to the recruitment process. Traditional note-taking methods during interviews are inherently flawed, prone to human error, and can be biased. AI transcription, on the other hand, captures every word as it is spoken, ensuring that nothing is missed or misinterpreted. This verbatim record allows hiring managers to review the interview with a fine-tooth comb, ensuring that decisions are made based on what was actually said, rather than on recollection or interpretation.

Improved Candidate Experience

The candidate experience is crucial in attracting top talent. AI transcription contributes to a more engaging and less intimidating interview environment. Candidates can speak freely, knowing that their responses are being accurately recorded. This level of precision in capturing responses allows candidates to feel heard and valued, improving their overall experience and perception of the company. Furthermore, the efficiency of AI transcription speeds up the interview process, enabling quicker feedback and follow-ups, which candidates appreciate.

Accessibility and Inclusivity

AI transcription makes interviews more accessible and inclusive, breaking down barriers for candidates who may be deaf or hard of hearing. By providing real-time transcription, these candidates can participate fully in the interview process, ensuring equal opportunities for all. This commitment to inclusivity not only enhances the company's reputation but also widens the talent pool by making it more diverse.

Streamlined Analysis and Decision Making

The comprehensive data captured by AI transcription allows for a more streamlined analysis of candidates' responses. Hiring teams can utilize this data to identify patterns, evaluate key competencies, and make more informed decisions. AI tools can also highlight specific skills and qualifications mentioned during the interview, making it easier to assess candidates against job requirements. This data-driven approach reduces the likelihood of cognitive biases affecting hiring decisions and helps in selecting the best fit for the role.

Enhanced Collaboration Among Hiring Teams

AI transcription facilitates better collaboration among hiring team members, especially in scenarios where not all members can attend every interview. The transcribed interviews can be easily shared, allowing team members to review and provide input at their convenience. This collaborative approach ensures a more comprehensive evaluation of candidates, drawing on diverse perspectives within the team.

Legal Compliance and Record-Keeping

In today's litigious environment, maintaining accurate records of candidate interviews can be crucial for legal compliance. AI transcription provides an exact record of the conversation, which can be invaluable in defending against claims of discrimination or bias. This level of documentation ensures that the hiring process is transparent and defensible, safeguarding the company against potential legal challenges.

Conclusion

The adoption of AI transcription in candidate interviews is more than just a nod to technological advancement; it's a strategic move towards making the recruitment process more accurate, fair, and inclusive. By enhancing the accuracy of interviews, improving candidate experience, and streamlining decision-making, AI transcription is setting a new standard in talent acquisition. As companies continue to seek competitive advantages in attracting top talent, the use of AI in recruitment processes, especially transcription, is likely to become not just common but essential.

Categories: DBA Blogs

Tutorial to Implement RAG with Gemma Model Locally

Mon, 2024-02-26 20:20

 Unlock the power of Retrieval-Augmented Generation (RAG) locally using the Gemma model with our detailed step-by-step tutorial. Learn how to enhance your projects by integrating RAG for insightful document processing and AI-driven content generation. Perfect for developers, data scientists, and AI enthusiasts eager to leverage advanced NLP techniques on their own documents. No prior RAG experience required!




Commands Used:


pip install -U "transformers==4.38.1" --upgrade
pip install -q pypdf
pip install -q python-dotenv
pip install llama-index==0.10.12
pip install -q gradio
pip install einops
pip install accelerate
pip install llama-index-llms-huggingface
pip install llama-index-embeddings-fastembed
pip install fastembed

import logging
import sys

logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core import Settings

documents = SimpleDirectoryReader("/home/ubuntu/pdfs").load_data()

from llama_index.embeddings.fastembed import FastEmbedEmbedding

embed_model = FastEmbedEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.embed_model = embed_model
Settings.chunk_size = 512

from llama_index.core import PromptTemplate

system_prompt = "You are a Q&A assistant. Your goal is to answer questions as accurately as possible based on the instructions and context provided."

query_wrapper_prompt = PromptTemplate("<|USER|>{query_str}<|ASSISTANT|>")

from huggingface_hub.hf_api import HfFolder
HfFolder.save_token('<your huggingface token from huggingface.co>')

import torch

llm = HuggingFaceLLM(
    context_window=8192,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.7, "do_sample": False},
    system_prompt=system_prompt,
    query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name="google/gemma-7b-it",
    model_name="google/gemma-7b-it",
    device_map="auto",
    tokenizer_kwargs={"max_length": 4096},
    model_kwargs={"torch_dtype": torch.float16}
)

Settings.llm = llm
Settings.chunk_size = 512

index = VectorStoreIndex.from_documents(documents)

query_engine = index.as_query_engine()

def predict(input, history):
    response = query_engine.query(input)
    return str(response)
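# The query engine can also be exercised directly, without the Gradio UI.
# A quick sanity check (the question below is illustrative and assumes the indexed PDFs cover it):
response = query_engine.query("What topics does this document cover?")
print(response)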

  

import gradio as gr

gr.ChatInterface(predict).launch(share=True)




Categories: DBA Blogs

Build Your Own Audio Transcription Tool with AI

Fri, 2024-02-23 22:45

 In this video, you will learn how to deploy an LLM-based application into production by using Amazon Bedrock and Amazon Transcribe to transcribe audio files with an ASR model and summarize them with Titan.



Code:



import boto3
import json
import uuid
import time
from jinja2 import Template

bedrock_runtime = boto3.client('bedrock-runtime', region_name='us-east-1')

s3_client = boto3.client(service_name='s3', region_name='ap-southeast-2')

transcribe_client = boto3.client('transcribe', region_name='ap-southeast-2')

bucket_name='<replace your bucket name here>'

#file_name = 'angry.mp3'
file_name = 'happy.mp3'

job_name = 'transcription-job-' + str(uuid.uuid4())

response = transcribe_client.start_transcription_job(
    TranscriptionJobName=job_name,
    Media={'MediaFileUri': f's3://{bucket_name}/{file_name}'},
    MediaFormat='mp3',
    LanguageCode='en-US',
    OutputBucketName=bucket_name,
    Settings={
        'ShowSpeakerLabels': True,
        'MaxSpeakerLabels': 2
    }
)

while True:
    status = transcribe_client.get_transcription_job(TranscriptionJobName=job_name)
    if status['TranscriptionJob']['TranscriptionJobStatus'] in ['COMPLETED', 'FAILED']:
        break
    time.sleep(2)
print(status['TranscriptionJob']['TranscriptionJobStatus'])

if status['TranscriptionJob']['TranscriptionJobStatus'] == 'COMPLETED':
   
    # Load the transcript from S3.
    transcript_key = f"{job_name}.json"
    transcript_obj = s3_client.get_object(Bucket=bucket_name, Key=transcript_key)
    transcript_text = transcript_obj['Body'].read().decode('utf-8')
    transcript_json = json.loads(transcript_text)
   
    output_text = ""
    current_speaker = None
   
    items = transcript_json['results']['items']
   
    for item in items:
       
        speaker_label = item.get('speaker_label', None)
        content = item['alternatives'][0]['content']
       
        # Start the line with the speaker label:
        if speaker_label is not None and speaker_label != current_speaker:
            current_speaker = speaker_label
            output_text += f"\n{current_speaker}: "
           
        # Add the speech content:
        if item['type'] == 'punctuation':
            output_text = output_text.rstrip()
           
        output_text += f"{content} "
       
    # Save the transcript to a text file
    with open(f'{job_name}.txt', 'w') as f:
        f.write(output_text)

with open(f'{job_name}.txt', "r") as file:
    transcript = file.read()

template_string = """ I need to summarize a conversation. The transcript of the
conversation is between the <data> XML like tags.

<data>
{{transcript}}
</data>

The summary must contain a one word sentiment analysis, and
a list of issues, problems or causes of friction
during the conversation. The output must be provided in
JSON format shown in the following example.

Example output:
{
    "sentiment": <sentiment>,
    "issues": [
        {
            "topic": <topic>,
            "summary": <issue_summary>,
        }
    ]
}

Write the JSON output and nothing more.

Here is the JSON output:   """

data = {
    'transcript' : transcript
}

template = Template(template_string)

prompt = template.render(data)

kwargs = {
    "modelId": "amazon.titan-text-express-v1",
    "contentType": "application/json",
    "accept": "*/*",
    "body": json.dumps(
        {
            "inputText": prompt,
            "textGenerationConfig": {
                "maxTokenCount": 512,
                "temperature": 0,
                "topP": 0.9
            }
        }
    )
}

response = bedrock_runtime.invoke_model(**kwargs)

response_body = json.loads(response.get('body').read())
generation = response_body['results'][0]['outputText']

print(generation)
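# Since the prompt asks Titan for JSON only, the returned text can usually be parsed back
# into a Python object (a sketch; it assumes the model honoured the requested format):
summary = json.loads(generation)
print("Sentiment:", summary["sentiment"])
for issue in summary.get("issues", []):
    print(f"- {issue['topic']}: {issue['summary']}")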
Categories: DBA Blogs

Manage and Run Gemma LLM with Keras Locally

Fri, 2024-02-23 01:59

This video shows how to install and manage Gemma LLM with Keras. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.



Code:

!pip install keras --upgrade

!pip install kaggle

from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))
 
# Then move kaggle.json into the folder where the API expects to find it.
!mkdir -p ~/.kaggle/ && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json

!pip install keras_nlp --upgrade
!pip install keras --upgrade

import os

os.environ["KERAS_BACKEND"] = "jax"

import keras_nlp
import keras
import tensorflow as tf
import time

keras.mixed_precision.set_global_policy("mixed_float16")

preprocessor = keras_nlp.models.GemmaPreprocessor.from_preset(
    "gemma_2b_en"
)

gemma_lm = keras_nlp.models.GemmaCausalLM.from_preset("gemma_2b_en")

gemma_lm.generate("which one came first, egg or chicken?", max_length=130)
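# Optionally swap the decoding strategy before generating; a sketch, assuming the
# top-k sampler shipped with keras_nlp (the value of k is illustrative):
gemma_lm.compile(sampler=keras_nlp.samplers.TopKSampler(k=5))
print(gemma_lm.generate("which one came first, egg or chicken?", max_length=130))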

 

Categories: DBA Blogs

Step by Step Guide to Configure Amazon Bedrock with VPC Endpoints and PrivateLink

Thu, 2024-02-01 04:20

This video is a step-by-step tutorial on setting up Amazon Bedrock with VPC endpoints and AWS PrivateLink to build secure and private generative AI applications.



 Steps:


Step 0: Make sure a private subnet exists with a private route table that has no route to the internet.

Step 1: Create two security groups: Bedrock-Endpoint-SG and Bedrock-Lambda-SG.

Step 2: In Bedrock-Lambda-SG, add Bedrock-Endpoint-SG as the source for all traffic inbound, and allow outbound to 0.0.0.0/0.

Step 3: In Bedrock-Endpoint-SG, add Bedrock-Lambda-SG for all traffic inbound and outbound.

Step 4: Create two interface endpoints, bedrock and bedrock-runtime, in the private subnet and attach Bedrock-Endpoint-SG to both (see the boto3 sketch after these steps).

Step 5: Create the Lambda function, set its timeout to 15 seconds, attach Bedrock-Lambda-SG, and make sure the Lambda execution role has Bedrock permissions.
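
For Step 4, the two interface endpoints can also be created with boto3 instead of the console; a minimal sketch with placeholder VPC, subnet, and security-group IDs:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
for service in ["bedrock", "bedrock-runtime"]:
    ec2.create_vpc_endpoint(
        VpcId="vpc-0123456789abcdef0",                 # placeholder VPC ID
        VpcEndpointType="Interface",
        ServiceName=f"com.amazonaws.us-east-1.{service}",
        SubnetIds=["subnet-0123456789abcdef0"],        # the private subnet (placeholder)
        SecurityGroupIds=["sg-0123456789abcdef0"],     # Bedrock-Endpoint-SG (placeholder)
        PrivateDnsEnabled=True,
    )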


Lambda Code:


import boto3
import json

def lambda_handler(event, context):
    bedrock = boto3.client(
        service_name='bedrock',
        region_name='us-east-1'
    )

    # Bedrock Runtime client used to invoke and question the models
    bedrock_runtime = boto3.client(
        service_name='bedrock-runtime',
        region_name='us-east-1'
    )

    models = bedrock.list_foundation_models().get('modelSummaries')

    for model in models:
        print(model['modelName'] + ', Input=' + '-'.join(model['inputModalities']) + ', Output=' + ''.join(model['outputModalities']) + ', Provider=' + model['providerName'])

    return {
        'statusCode': 200,
    }

Categories: DBA Blogs

How to Identify Oracle Database Orphan Sessions

Fri, 2024-01-26 00:17

 In the world of database management, particularly with Oracle databases, "orphan sessions" are a common issue that can affect performance and resource utilization. 

In Oracle databases, an orphan session, sometimes known as a "zombie session," is a session that remains in the database even though its corresponding client process has terminated. These sessions no longer have a user actively interacting with them, yet they consume system resources and can hold locks, leading to performance degradation and blocking issues.

Orphan sessions can occur due to various reasons such as:

  • Network issues that disrupt the connection between the client and the server.
  • Application or client crashes that terminate the session abnormally.
  • Database bugs or misconfigurations.

Queries to Identify Orphan Sessions:

SELECT s.sid, s.serial#, p.spid, s.username, s.program
FROM v$session s
JOIN v$process p ON p.addr = s.paddr
WHERE s.type != 'BACKGROUND';

This query lists active sessions, excluding background processes. It provides session identifiers (sid, serial#), the operating system process identifier (spid), and the username and program name. Orphan sessions often show NULL or unusual entries in the program column.

SELECT s.sid, s.serial#, p.spid, s.username, s.program
FROM v$session s
JOIN v$process p ON p.addr = s.paddr
WHERE s.type != 'BACKGROUND'
AND NOT EXISTS (SELECT NULL FROM v$process WHERE spid = s.process);

This query filters the sessions where the client process (spid) associated with the session does not exist in the v$process view, indicating a potential orphan.


SELECT s.sid, s.serial#, l.object_id, o.object_name, o.object_type
FROM v$session s
JOIN v$lock l ON s.sid = l.sid
JOIN dba_objects o ON o.object_id = l.object_id
WHERE s.sid IN (SELECT sid FROM v$session WHERE ... /* Conditions from above queries */);


This query identifies locks held by sessions suspected to be orphans, which is useful for understanding the impact of these sessions on the database.

How to Manage Orphan Sessions:

Manual Termination: Using the ALTER SYSTEM KILL SESSION command to terminate identified orphan sessions, or killing the server process at the OS level with kill -9 <spid>.
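
For example, a suspect session identified by the queries above can be terminated as follows (the SID and SERIAL# values are placeholders):

ALTER SYSTEM KILL SESSION '123,45678' IMMEDIATE;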

Automated Monitoring and Cleanup: Implementing automated scripts or database jobs to periodically identify and clean up orphan sessions.

Prevention: Addressing the root causes, such as network stability and application robustness, can reduce the occurrence of orphan sessions.

Categories: DBA Blogs

Oracle OCI's Generative AI Service: A New Era in Cloud Computing

Thu, 2024-01-25 23:47

 The world of cloud computing is witnessing a revolutionary change with the introduction of Oracle Cloud Infrastructure's (OCI) Generative AI Service. This innovative offering from Oracle is a testament to the rapidly evolving field of artificial intelligence (AI), particularly in the realm of generative models. As businesses and developers seek more efficient and creative solutions, Oracle's new service stands out as a significant milestone.


What is Oracle OCI's Generative AI Service?

Oracle's OCI Generative AI Service is a cloud-based platform that provides users with access to powerful generative AI models. These models are capable of creating a wide range of content, including text, images, and possibly even audio or video in the future. The service is designed to integrate seamlessly with other OCI offerings, ensuring a cohesive and efficient cloud computing experience.


Key Features and Capabilities

Advanced AI Models

At the heart of OCI's Generative AI Service are state-of-the-art AI models that have been trained on vast datasets. These models can generate high-quality, original content based on user inputs, making them invaluable for a variety of applications.


Scalability and Performance

Oracle's robust cloud infrastructure ensures that the Generative AI Service can scale to meet the demands of any project, big or small. This scalability is crucial for handling large-scale AI tasks without compromising on performance or speed.


Integration with OCI Ecosystem

The service is designed to work seamlessly with other OCI products, such as data storage, analytics, and security services. This integration allows for a more streamlined workflow, as users can easily access and combine different OCI services.


Use Cases

The potential applications of Oracle OCI's Generative AI Service are vast and varied. Here are a few examples:


Content Creation

For marketers and content creators, the service can generate written content, images, and potentially other forms of media. This capability can significantly speed up the content creation process and inspire new ideas.


Business Intelligence

Businesses can leverage the AI's ability to analyze and synthesize information to gain insights from data. This can aid in decision-making, trend analysis, and strategy development.

Research and Development

In the R&D sector, the service can assist in generating hypotheses, modeling complex systems, and even predicting outcomes, thereby accelerating the pace of innovation.


Security and Ethics

Oracle recognizes the importance of ethical AI use and has implemented measures to ensure the responsible deployment of its Generative AI Service. This includes safeguards against generating harmful or biased content and maintaining user privacy and data security.


Getting Started with OCI Generative AI Service

To start using the service, users need to have an Oracle Cloud account. Oracle provides comprehensive documentation and support to help users integrate the AI service into their projects.


Conclusion

Oracle OCI's Generative AI Service is a groundbreaking addition to the cloud computing landscape. It offers immense potential for businesses, developers, and creators to harness the power of AI for generating content and gaining insights. As the technology continues to evolve, it will be exciting to see the innovative applications that emerge from this platform.


Oracle's commitment to integrating advanced AI capabilities into its cloud services is a clear indicator of the transformative impact AI is set to have across industries. The OCI Generative AI Service is not just a tool; it's a gateway to a future where AI and cloud computing work hand in hand to unlock new possibilities.

Categories: DBA Blogs

Top Code LLM in the World - Locally Install Stable Code 3B without GPU

Thu, 2024-01-18 01:27

This video walks through a step-by-step guide to locally install a top code AI model that can run on a CPU and is very small in size.



Code:


pip install transformers torch
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
  "stabilityai/stable-code-3b",
  trust_remote_code=True,
  torch_dtype="auto",
)
# Move to GPU only if one is available; as noted above, the model also runs on CPU.
if torch.cuda.is_available():
    model.cuda()
inputs = tokenizer("write me a script in Java to reverse a list", return_tensors="pt").to(model.device)
tokens = model.generate(
  **inputs,
  max_new_tokens=500,
  temperature=0.2,
  do_sample=True,
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
Categories: DBA Blogs

How to Install NVIDIA Drivers on AWS EC2 Instance Windows

Sun, 2024-01-14 18:40

 This video shows how to install NVIDIA drivers on a Windows AWS EC2 instance (G4DN and other GPU instance types).




Commands Used:


msiexec.exe /i https://awscli.amazonaws.com/AWSCLIV2.msi


aws --version 


In a new window, run aws configure and set your IAM user access key ID and secret access key.


Run the following in PowerShell as administrator:


Install-Module -Name AWS.Tools.Installer

$Bucket = "ec2-windows-nvidia-drivers"
$KeyPrefix = "latest"
$LocalPath = "$home\Desktop\NVIDIA"
$Objects = Get-S3Object -BucketName $Bucket -KeyPrefix $KeyPrefix -Region us-east-1
foreach ($Object in $Objects) {
    $LocalFileName = $Object.Key
    if ($LocalFileName -ne '' -and $Object.Size -ne 0) {
        $LocalFilePath = Join-Path $LocalPath $LocalFileName
        Copy-S3Object -BucketName $Bucket -Key $Object.Key -LocalFile $LocalFilePath -Region us-east-1
    }
}

Categories: DBA Blogs

Talk with Comics Using AI in Any Language

Sat, 2024-01-13 23:26

 This video shows a step-by-step demo, with code, of how to analyze comics in any language and talk to them using LlamaIndex and ChatGPT.




Code Used:

%pip install llama_index ftfy regex tqdm
%pip install git+https://github.com/openai/CLIP.git
%pip install torch torchvision
%pip install matplotlib scikit-image
%pip install -U qdrant_client

import os

openai_api_key = os.environ['OPENAI_API_KEY']

from PIL import Image
import matplotlib.pyplot as plt
import os

image_paths = []
for img_path in os.listdir("./urdu"):
    image_paths.append(str(os.path.join("./urdu", img_path)))


def plot_images(image_paths):
    images_shown = 0
    plt.figure(figsize=(25, 12))
    for img_path in image_paths:
        if os.path.isfile(img_path):
            image = Image.open(img_path)

            plt.subplot(2, 2, images_shown + 1)
            plt.imshow(image)
            plt.xticks([])
            plt.yticks([])

            images_shown += 1
            if images_shown >= 4:  # the 2x2 subplot grid above holds at most 4 images
                break


plot_images(image_paths)


from llama_index.multi_modal_llms.openai import OpenAIMultiModal
from llama_index import SimpleDirectoryReader

image_documents = SimpleDirectoryReader("./urdu").load_data()

openai_mm_llm = OpenAIMultiModal(
    model="gpt-4-vision-preview", api_key=openai_api_key, max_new_tokens=1500
)

response_eng = openai_mm_llm.complete(
    prompt="Describe the comic strip panels as an alternative text",
    image_documents=image_documents,
)

print(response_eng)
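
# The same call works for responses in other languages; only the prompt changes
# (a sketch, asking for the description in Urdu this time):
response_urdu = openai_mm_llm.complete(
    prompt="Describe the comic strip panels as alternative text, in Urdu",
    image_documents=image_documents,
)
print(response_urdu)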


Categories: DBA Blogs

Use AI to Query AWS RDS Database with LlamaIndex

Mon, 2024-01-08 23:26

 This video shows a step-by-step guide, with code, on integrating LlamaIndex with an AWS RDS PostgreSQL database to query it in natural language. It's AI and LLMs at their best.




Commands Used:



sudo apt-get install libpq-dev

pip install llama-index sqlalchemy psycopg2

from sqlalchemy import create_engine, MetaData
from llama_index import SQLDatabase, VectorStoreIndex
from llama_index.indices.struct_store import SQLTableRetrieverQueryEngine
from llama_index.objects import SQLTableNodeMapping, ObjectIndex, SQLTableSchema

pg_uri = f"postgresql+psycopg2://postgres:test1234@<RDS Endpoint>:5432/testdb"

engine = create_engine(pg_uri)

metadata_obj = MetaData()
metadata_obj.reflect(engine)

sql_database = SQLDatabase(engine)

from llama_index.indices.struct_store import NLSQLTableQueryEngine


query_engine = NLSQLTableQueryEngine(
sql_database=sql_database,
tables=["companies","contacts"],
)

query_str = "who works in AWS?"
response = query_engine.query(query_str)
print(response)

query_str = "How many people work in GCP and what are their names?"
response = query_engine.query(query_str)
print(response)


Categories: DBA Blogs

Train TinyLlama 1.1B Locally on Own Custom Dataset

Fri, 2024-01-05 12:11

 This video is an easy and simple tutorial on how to train or fine-tune the TinyLlama model locally on your own data by using Unsloth.


Code Used:


import torch
major_version, minor_version = torch.cuda.get_device_capability()

!pip install "unsloth[colab] @ git+https://github.com/unslothai/unsloth.git"

from unsloth import FastLanguageModel
import torch

max_seq_length = 4096
dtype = None
load_in_4bit = True

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/tinyllama-bnb-4bit",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 32, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 32,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = False,
    random_state = 3407,
    max_seq_length = max_seq_length,
)
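
# Note: the original listing uses a `dataset` variable below without defining it.
# A minimal, hypothetical way to load your own data (the file name and format are
# placeholders -- the dataset just needs a "text" column for SFTTrainer):
from datasets import load_dataset
dataset = load_dataset("json", data_files="my_data.jsonl", split="train")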


from trl import SFTTrainer
from transformers import TrainingArguments
from transformers.utils import logging
logging.set_verbosity_info()

trainer = SFTTrainer(
    model = model,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    packing = True,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_ratio = 0.1,
        num_train_epochs = 1,
        learning_rate = 2e-5,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.1,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
    ),
)

trainer_stats = trainer.train()

Categories: DBA Blogs

How to Build RAG Pipeline with Mixtral 8x7B to Talk to Your Own Documents

Wed, 2023-12-13 18:26

 This video shows the step-by-step process of locally building a RAG pipeline with Mixtral 8x7B to talk to local documents such as PDFs.





Commands Used:


#%%capture
!pip install farm-haystack[colab]

from getpass import getpass
HF_TOKEN = getpass("Hugging Face Token")

from haystack.nodes import PreProcessor, PromptModel, PromptTemplate, PromptNode

from google.colab import files
files.upload()

%%capture
!pip install PyPDF2

import PyPDF2
from haystack import Document

pdf_file_path = "e10897.pdf"  # Replace with the path to your PDF file

def extract_text_from_pdf(pdf_path):
    text = ""
    with open(pdf_path, "rb") as pdf_file:
        pdf_reader = PyPDF2.PdfReader(pdf_file)
        for page_num in range(len(pdf_reader.pages)):
            page = pdf_reader.pages[page_num]
            text += page.extract_text()
    return text

pdf_text = extract_text_from_pdf(pdf_file_path)

# Create the Haystack document
doc = Document(
    content=pdf_text,
    meta={"pdf_path": pdf_file_path}
)

docs = [doc]

processor = PreProcessor(
    clean_empty_lines=True,
    clean_whitespace=True,
    clean_header_footer=True,
    split_by="word",
    split_length=500,
    split_respect_sentence_boundary=True,
    split_overlap=0,
    language="it",  # change to "en" if your documents are in English
)

preprocessed_docs = processor.process(docs)

from haystack.document_stores import InMemoryDocumentStore

document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents(preprocessed_docs)

from haystack import Pipeline
from haystack.nodes import BM25Retriever

retriever = BM25Retriever(document_store, top_k=2)

qa_template = PromptTemplate(prompt=
  """ Using only the information contained in the context,
  answer only the question asked without adding suggestions of possible questions and answer exclusively in Italian.
  If the answer cannot be deduced from the context, reply: "\I don't know because it is not relevant to the Context.\"
  Context: {join(documents)};
  Question: {query}
  """)

prompt_node = PromptNode(
    model_name_or_path="mistralai/Mixtral-8x7B-Instruct-v0.1",
    api_key=HF_TOKEN,
    default_prompt_template=qa_template,
    max_length=500,
    model_kwargs={"model_max_length": 5000}
)

rag_pipeline = Pipeline()
rag_pipeline.add_node(component=retriever, name="retriever", inputs=["Query"])
rag_pipeline.add_node(component=prompt_node, name="prompt_node", inputs=["retriever"])

from pprint import pprint
print_answer = lambda out: pprint(out["results"][0].strip())

print_answer(rag_pipeline.run(query="What is Oracle DBA?"))

print_answer(rag_pipeline.run(query="Why Lion is king of jungle?"))

Categories: DBA Blogs

Mixtral 8X7B Local Installation - Step by Step

Mon, 2023-12-11 22:39
This is a simple tutorial to locally install Mixtral 8x7B.



pip3 install --upgrade transformers optimum
pip3 uninstall -y auto-gptq
git clone https://github.com/PanQiWei/AutoGPTQ
cd AutoGPTQ
git checkout v0.5.1
pip3 install .
model_name_or_path = "TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ"
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, GPTQConfig
from auto_gptq import AutoGPTQForCausalLM

# model_name_or_path = args.model_dir  # only if you parse a local model directory yourself
# To use a different branch, change revision
# For example: revision="gptq-4bit-32g-actorder_True"
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path,
        model_basename="model",
        use_safetensors=True,
        trust_remote_code=False,
        device="cuda:0",
        use_triton=False,
        disable_exllama=False,
        disable_exllamav2=True,
        quantize_config=None)

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True, trust_remote_code=False)

prompt = "Why Lion is King of Jungle?"
prompt_template=f'''<s>[INST] {prompt} [/INST]
'''

print("\n\n*** Generate:")

input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
output = model.generate(inputs=input_ids, temperature=0.7, do_sample=True, top_p=0.95, top_k=40, max_new_tokens=512)
print(tokenizer.decode(output[0]))
Categories: DBA Blogs

AWS PartyRock - Amazon Bedrock AI Playground

Thu, 2023-11-16 16:02

With PartyRock, you can easily build AI apps in seconds, for free, using the latest LLMs and without writing any code.



Categories: DBA Blogs

Beginner Tutorial to Fine-Tune an AI Model

Thu, 2023-10-26 01:29

 This video steps through an easy tutorial on fine-tuning a model on a custom dataset from scratch by using LlamaIndex and Gradient.




Dataset Used:


{"inputs": "<s>### Instruction:\nWho is Fahd Mirza?\n\n### Response:\nFahd Mirza is an AI Cloud Engineer based in Sydney Australia. He has also got a background in databases and devops plus infrastrucutre.</s>"}

{"inputs": "<s>### Instruction:\nWhat are hobbies of Fahd Mirza?\n\n### Response\nFahd Mirza loves to spend time on his youtube channel and reading about technology.</s>"}

{"inputs": "<s>### Instruction:\nWhat Fahd Mirza's favorite Color?\n\n### Response:\nFahd Mirza's favorite color varies from time to time. These days its blue.</s>"}

{"inputs": "<s>### Instruction:\nWhat does Fahd Mirza look like?\n\n### Response:\nFahd Mirza looks like a human.</s>"}


.env File:


GRADIENT_ACCESS_TOKEN='<>'

GRADIENT_WORKSPACE_ID='<>'


Commands Used:


!pip install llama-index gradientai -q
!pip install python-dotenv

import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

questions = [
    "Who is Fahd Mirza??",
    "What is Fahd Mirza's favorite Color?",
    "What are hobbies of Fahd Mirza?",
]

prompts = list(
    f"<s> ### Instruction:\n{q}\n\n###Response:\n" for q in questions
)

print(prompts)

import os
from llama_index.llms import GradientBaseModelLLM
from llama_index.finetuning.gradient.base import GradientFinetuneEngine

base_model_slug = "nous-hermes2"
base_model_llm = GradientBaseModelLLM(
    base_model_slug=base_model_slug, max_tokens=100
)

base_model_responses = list(base_model_llm.complete(p).text for p in prompts)

finetune_engine = GradientFinetuneEngine(
    base_model_slug=base_model_slug,
    name="my test finetune engine model adapter",
    data_path="data.jsonl",
)

epochs = 2
for i in range(epochs):
    finetune_engine.finetune()

fine_tuned_model = finetune_engine.get_finetuned_model(max_tokens=100)

fine_tuned_model_responses = list(
    fine_tuned_model.complete(p).text for p in prompts
)

fine_tuned_model._model.delete()

for i, q in enumerate(questions):
    print(f"Question: {q}")
    print(f"Base: {base_model_responses[i]}")
    print(f"Fine tuned: {fine_tuned_model_responses[i]}")
    print()


Categories: DBA Blogs

Setting Environment Variable in Google Colab

Wed, 2023-10-25 23:29

This video shows how to set environment variables and load them in a Google Colab notebook, an AWS SageMaker notebook, or a Jupyter notebook.



Commands Used:


import os
from dotenv import load_dotenv, find_dotenv
_= load_dotenv(find_dotenv())
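
# If no .env file exists yet in the notebook environment, one can be written first
# (the variable name and value below are placeholders):
with open(".env", "w") as f:
    f.write("OPENAI_API_KEY=<your key here>\n")
_ = load_dotenv(find_dotenv())
print(os.environ["OPENAI_API_KEY"])  # confirm the variable is now visible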


Categories: DBA Blogs
