1. 问题背景
最近,在公司的MSSQL服务器上出现了一个奇怪的问题:服务器经常重启,但重启后无法正常启动,且提示过期处理缺失的错误信息。这给我们的工作带来了很大的困扰。为了解决这个问题,我们做了一些调查和研究,并在这里记录下了相关的解决方案。
2. 问题分析
2.1 重启失败原因探究
首先,我们需要弄清楚服务器重启失败的原因。我们通过分析错误日志,得到了以下信息:
2019-05-23 10:23:45.33 Server SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.
2019-05-23 10:23:45.33 Server SQL Server detected 1 sockets with 8 cores per socket and 8 logical processors per socket, 8 total logical processors; using 8 logical processors based on SQL Server licensing. This is an informational message; no user action is required.
2019-05-23 10:23:45.35 Server SQL Server instance ID: 09b231d9-d25f-4b54-8f4e-60e95e04c5d7
2019-05-23 10:23:45.35 Server SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.
2019-05-23 10:23:45.35 Server Detected 64000 MB of RAM. This is an informational message; no user action is required.
2019-05-23 10:23:45.56 Server Using 'dbghelp.dll' version '4.0.5'
2019-05-23 10:23:45.56 Server ** This instance of SQL Server last reported using a process ID of 1380 at 2018-07-02 09:20:24 (local) 2018-07-02 01:20:24 (UTC). This is an informational message only; no user action is required.**
2019-05-23 10:23:45.56 Server ** The time stamp counter of CPU on scheduler id 1 is not synchronized with other CPUs.**
2019-05-23 10:23:45.56 Server ** Microsoft SQL Server 2014 (SP3-GDR) (KB4520011) - 12.0.6309.0 (X64)**
Nov 4 2019 18:58:31
Copyright (c) Microsoft Corporation
Enterprise Edition (64-bit) on Windows NT 6.3 (Build 9600: )
2019-05-23 10:23:45.56 Server UTC adjustment: 8:00
2019-05-23 10:23:45.56 Server (c) Microsoft Corporation.
2019-05-23 10:23:45.56 Server All rights reserved.
2019-05-23 10:23:45.56 Server Server process ID is 1380.
2019-05-23 10:23:45.56 Server System Manufacturer: 'Dell Inc.', System Model: 'PowerEdge R730xd'.
2019-05-23 10:23:45.56 Server Authentication mode is WINDOWS-ONLY.
2019-05-23 10:23:45.56 Server Logging SQL Server messages in file 'D:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Log\ERRORLOG'.
2019-05-23 10:23:45.56 Server The service account is 'NT Service\MSSQLSERVER'. This is an informational message; no user action is required.
2019-05-23 10:23:45.56 Server Registry startup parameters:
-d D:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\master.mdf
-e D:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\Log\ERRORLOG
-l D:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\mastlog.ldf
2019-05-23 10:23:45.56 Server Command Line Startup Parameters:
-s MSSQLSERVER
2019-05-23 10:23:45.57 spid7s SQL Server detected a logical consistency-based I/O error: invalid protection option. It occurred during a read of page (1:537) in database ID 1 at offset 0x0000000043c000 in file 'D:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\master.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
2019-05-23 10:23:45.57 spid7s Error: 824, Severity: 24, State: 2.
2019-05-23 10:23:45.57 spid7s SQL Server detected a logical consistency-based I/O error: invalid protection option. It occurred during a read of page (1:537) in database ID 1 at offset 0x0000000043c000 in file 'D:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\master.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
2019-05-23 10:23:45.57 spid7s SQL Server shutdown has been initiated
从以上错误日志可以看到,MSSQL服务器在重启后检测到了一个“逻辑一致性基于I/O的错误”,并且提示“过期处理缺失”的错误信息,最终导致服务器启动失败。这种错误通常是由于数据库文件损坏或版本不一致引起的。
2.2 解决方案
根据以上错误分析,我们制定了以下解决方案:
恢复备份:如果服务器上有可用的数据库备份,则可以尝试使用备份来恢复数据并修复问题。
修复数据库:如果备份不可用,可以尝试使用DBCC CHECKDB命令来尝试修复数据库。
更新MSSQL服务器:如果以上方法都无法解决问题,可以尝试更新MSSQL服务器版本。
3. 解决过程
3.1 备份恢复
在我们的案例中,由于没有足够的备份可以使用,我们无法尝试备份恢复的解决方案。所以我们只能尝试使用DBCC CHECKDB命令来尝试修复数据库。
3.2 数据库修复
DBCC CHECKDB是一种检查数据库完整性的命令。它将扫描整个数据库,并报告发现的任何错误或损坏。我们运行以下命令来检查数据库:
USE master;
GO
DBCC CHECKDB ('DatabaseName');
GO
运行命令后,我们得到了以下结果:
Msg 824, Level 24, State 2, Line XX
SQL Server detected a logical consistency-based I/O error:
invalid protection option. It occurred during a read of page (1:537)
in database ID 1 at offset 0x0000000043c000 in file
'D:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\master.mdf'.
Additional messages in the SQL Server error log or system event log
may provide more detail. This is a severe error condition that
threatens database integrity and must be corrected immediately. Complete
a full database consistency check (DBCC CHECKDB). This error can be caused
by many factors; for more information, see SQL Server Books Online.
结果表明,我们的数据库存在一个逻辑一致性错误,需要立即修复。我们尝试使用命令DBCC CHECKDB(DatabaseName,REPAIR_ALLOW_DATA_LOSS)来修复数据库:
USE master;
GO
DBCC CHECKDB ('DatabaseName', REPAIR_ALLOW_DATA_LOSS);
GO
这个命令将修复发现的任何错误并且尝试恢复数据库的一致性。但是请注意,这个命令可能会导致部分数据丢失,应该谨慎使用。
3.3 MSSQL服务器更新
如果以上方法都无法解决问题,最后的解决方案是更新MSSQL服务器版本。我们下载了最新的MSSQL服务器安装包,并运行更新程序。重新启动服务器后,我们再次运行DBCC CHECKDB命令,发现数据库中的所有错误都已被修复。
4. 总结
本文介绍了MSSQL服务器重启失败和“过期处理缺失”错误的解决方案。在我们的情况下,由于备份不可用,我们只能尝试使用DBCC CHECKDB命令来修复数据库,并最终通过更新MSSQL服务器版本来解决问题。虽然这些方法都比较基础,但是如果您出现了类似的问题,希望这些解决方案能帮助您解决相关问题。