Airflow, Kubernetes and Docker Documentation : mssql

Showing posts with label mssql. Show all posts

Conflict Resolution Strategies in SQL Server Replication

SQL Server replication can be a powerful feature for distributing and synchronizing data across multiple database servers. However, it can also be complex, and errors can occur. Some of the most frequent errors in SQL Server replication include:

Network Issues: Network problems, such as dropped connections or high latency, can disrupt replication. Ensure that the network is stable and has adequate bandwidth.
Permissions: Insufficient permissions for the replication agents or accounts can lead to errors. Make sure that the necessary accounts have the required permissions to perform replication tasks.
Conflicts: Data conflicts, where the same record is updated on both the publisher and subscriber, can cause replication errors. You need to set up conflict resolution mechanisms to handle these situations.
Schema Changes: Altering the schema of replicated tables without updating the replication configuration can lead to errors. You should modify replication settings when making schema changes.
Firewalls and Security Software: Firewalls and security software can block replication traffic. Ensure that the necessary ports are open and security software doesn't interfere with replication.
Subscription Expiration: If a subscription expires or becomes inactive, it can lead to errors. Regularly monitor and maintain subscriptions to prevent this.
Lack of Maintenance: Over time, replication can generate a lot of data. If you don't regularly clean up old data, it can lead to performance issues and errors. Set up maintenance plans to keep replication healthy.
Agent Failures: Replication agents can encounter errors or failures. It's essential to monitor agent status and troubleshoot any agent-specific problems.
Transactional Log Growth: If the transaction log for the published database grows too large and runs out of space, it can disrupt replication. Properly manage transaction log size and backups.
Distribution Database Issues: The distribution database can become a bottleneck, and if it becomes corrupted, replication can fail. Monitor the health of the distribution database and perform regular maintenance.
Data Consistency: Ensuring data consistency across different servers can be challenging. Verify that the data on the subscriber matches the data on the publisher and address any inconsistencies promptly.
Server Downtime: Unexpected server outages or downtime can disrupt replication. Implement failover and redundancy strategies to minimize the impact of server failures.

To troubleshoot and resolve replication errors effectively, it's essential to monitor the replication environment, understand the specific error messages, and have a well-documented strategy for addressing common issues. Additionally, regularly testing and validating your replication setup can help identify and prevent potential errors.

Certainly! Point number 3 refers to "Conflicts" in the context of SQL Server replication. Data conflicts can occur when the same record is updated independently on both the publisher and the subscriber in a replication environment. This situation is common in scenarios where you have multiple copies of a database that need to stay in sync.

Here's more detailed explanation:

Let's say you have a database that is being replicated from a publisher (the source database) to one or more subscribers (target databases). If the same row of data is modified differently at both the publisher and a subscriber, it creates a conflict. For example:

On the publisher, someone updates a customer's address to "123 Main St."
Simultaneously, on a subscriber, someone updates the same customer's address to "456 Elm St."

Now, when replication attempts to synchronize these changes, it encounters a conflict because there's a discrepancy in the data. The replication system needs a way to determine which change should take precedence or how to merge these changes.

To address conflicts in SQL Server replication, you can define conflict resolution policies. There are several conflict resolution options available:

Publisher Wins: This policy prioritizes changes made at the publisher. In our example, the address "123 Main St." would take precedence, and the subscriber's change would be discarded.
Subscriber Wins: This policy prioritizes changes made at the subscriber. In our example, the address "456 Elm St." would be retained, and the publisher's change would be discarded.
Merge: This policy attempts to merge conflicting changes. In some cases, merging is not possible, and you may need to define rules for how to handle specific types of conflicts.
Custom Conflict Resolution: You can implement your own custom logic to handle conflicts based on your business requirements.

Setting up the appropriate conflict resolution method depends on your application's needs and the nature of your data. It's important to define these resolution policies during the setup of replication to ensure data consistency and prevent errors arising from conflicting updates.

Keep in mind that conflict resolution can introduce complexity into your replication configuration, so it's crucial to document and test your conflict resolution strategies thoroughly to ensure they work as expected in your specific use case.

SQL query to find duplicate records

Introduction

These days data is growing massively and since the data is growing the amount of bad data is also growing. Today we'll talk about one form of bad data which we can call duplicate data. We use few database management systems to store and process the data ie sql server, postgresql, oracle etc. We store lot of data in these systems but sometimes we get lot duplicate data as well in our systems which can impact the true data and reporting on which lot of businesses and important decisions are dependent.

So for us working on data, it's very important to identify the duplicates and remove them from the database which help businesses towards better reporting.

Let's get into the practical to understand the duplicates. First we need to create a table with few columns.

In this tutorial I am using postgresql.

CREATE TABLE duplicate_data_test(
id int,
name text
)

now let's insert some data with duplicates.

insert into duplicate_data_test values(1,'One');
insert into duplicate_data_test values(1,'One');
insert into duplicate_data_test values(2,'two');
insert into duplicate_data_test values(3,'three');
insert into duplicate_data_test values(4,'four');
insert into duplicate_data_test values(5,'three');

Query the data to see results:

select * from duplicate_data_test;

Let's find the duplicates for entire row.

select id,name,count(*) as count from duplicate_data_test
group by id,name having count(*) > 1;

Using CTE as well to get the duplicates.

with cte(id,name)
as
(select id,name,row_number() over (partition by id,name order by id) as rn 
from duplicate_data_test)
select * from cte;

Above query will give you the count. You can remove the count in select list if you want and then you will get duplicate record without count number.

Sometimes we want to find duplicates based on one column. Sometime we don't have entire row is duplicate rather just one column have some duplicate values.

So we can definitely use partition by clause like below.

select * from (
select id,name,rank() over (partition by name order by id) as rnk 
from duplicate_data_test
) as dt where dt.rnk > 1

Conclusion

That was it about duplicates. Always test above code in lower environment before implementing in Production.

How to run mssql/sql server on Mac

Are you looking to run microsoft sql server on Mac? Then you are at the right place.

I am going to tell you how you can run sql server easily in few minutes.

First you need to install docker on your system. You can use below link to download the docker on your mac system. https://www.docker.com/products/docker-desktop
Click on Mac with intel chip or if you have mac with apple chip.
Once it is downloaded ,run it and install it.
After installation docker will be start automatically or if it is not start then you can start it manually by searching it in launcher.
Open your terminal to check if installation happen correctly. Run below command
Docker -v
this command will give you docker version like below:

Docker version 20.10.7, build f0df440

Now run below commands in terminal:

docker pull mcr.microsoft.com/mssql/server:2019-latest

Output:

2019-latest: Pulling from mssql/server

a31c79f4ad: Pull complete

a039b99d1e: Pull complete

1d60c7fae0: Pull complete

b927883ade: Pull complete

120d687cdd: Pull complete

Digest: sha256:3a64da47fb2c8b4d730856b06930999af7ed4eab2d540fae2c7063da7a4fd

Status: Downloaded newer image for mcr.microsoft.com/mssql/server:2019-latest

mcr.microsoft.com/mssql/server:2019-latest

Once everything is downloaded run below command, It will run image in container:

docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=Test123" -p 1433:1433 --name sqltest -h sqltest -d mcr.microsoft.com/mssql/server:2019-latest

Output:

842bb946ac6d35bb15e232e526685e416

docker ps

above command will display running containers

Congrats! You have started your container.

Note: If you are on Mac M1 then download the Azure SQL Edge image because that has been ported on ARM64 architecture and follow below commands.

docker pull mcr.microsoft.com/azure-sql-edge

Now run the below command

docker run --cap-add SYS_PTRACE -e 'ACCEPT_EULA=1' -e 'MSSQL_SA_PASSWORD=Test@123' -p 1433:1433 --name azuresqledge -d mcr.microsoft.com/azure-sql-edge

Now open Visual studio. If you don't have visual studio then you can follow the below link:

https://code.visualstudio.com/

Now install it and open it and download mssql extension by microsoft like below:

Once extension is installed you can click on extension on left side for sql server connection on vscode.

2nd last icon is mssql on left side.

Click on it and pass values like

Server: localhost,1433
Database(optional) hit enter
login: select sql login
user : sa (which was given in command)
password: test (which was given in command)
hit enter and profile will be created.

Now you are connected with SQL server. You can see on your top left side under connections.

Right click on localhost or servername and click on new query.

In the query window type select 1; and run this select.

We have successfully running sql server on docker and we have just executed one query on sql server.

Conclusion

Running SQL Server on docker can be a good idea when you don't want to install anything on your local system. Since we don't have any setup of SQL Server for mac , running it with docker can be a good thing.

Note: If you think this helped you and you want to learn more stuff on devops, then I would recommend joining the Kodecloud devops course and go for the complete certification path by clicking this link

Conflict Resolution Strategies in SQL Server Replication

SQL query to find duplicate records

Introduction

Conclusion

That was it about duplicates. Always test above code in lower environment before implementing in Production.

How to run mssql/sql server on Mac

Click on it and pass values like

Conclusion

Quantum Computing: The Future of Supercomputing Explained