Database design that destroys the entire application-- Technology Life Series No. 28-- the story of me and the data center 07/12 Update SLTechnology News&Howtos

Database design that destroys the entire application-- Technology Life Series No. 28-- the story of me and the data center

2025-07-12 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

A seemingly simple question that destroys your entire application.

Absolutely not scaremongering, directly click on the big picture to see, if you feel that this problem does not matter and easily let go, then the huge risks hidden behind the problem will be ignored by you; this is the problem, a casual small link, can make PLSQL Devloper problems, also, enough to destroy your entire application!

Background introduction

This is a problem encountered in a customer's production data center, which has a special operation control center and a large number of operation terminals for unified maintenance of the data center system.

One day, the customer DBA asked: the application maintenance team will report an error when using the PLSQL developer on the operation terminal to connect to one of their users to query data (figure 1). After reporting the error, click the ok button PLSQL Developer will die, but if you use sqlplus to log in to do the operation will not be affected, and if connected to other users will not occur such a problem

As an application maintenance team, they often have to check some production data and even generate some reports, so PLSQL Developer on each terminal is their only tool that can connect directly to the database. PLSQL Developer can not be used, which has a great impact on them. From the point of view of the problem itself, the normal operation of the database is obviously not affected, but for application maintenance, a single user in the library can not use PLSQL Dev to operate, that is, it is considered to be a database problem.

Of course, we are duty-bound to solve problems that customers can't solve. Put on your robe and go on!

Special hint

Seeing such a problem, I can't help remembering that some time ago, the use of PLSQL Dev downloaded on the Internet led to an exception in the ORACLE database, as well as the recent abnormal situation in the database of ORACLE software downloaded on the Internet after running for a period of time. Here, readers are reminded to download the relevant software installation media from the official website and formal channels as far as possible to avoid giving lawbreakers the opportunity. Of course, we certainly don't have this problem here.

Problem materialization

The question is right in front of us. What kind of problem is this? First of all, we need to summarize some basic phenomena:

When we get the password of the iptXXX user, we can easily reproduce the problem. We can see that in the "SQL window", we find that the above problem occurs regardless of the query table / view.

If you use other users, there is no problem in this regard.

Query the database as a whole, and there are no obvious failure objects

Judging from the experience of error reporting "XXX must be declared", it is basically defined as the problem of authority.

In addition, we also need to consider the direct trigger of the problem here, there is no problem directly using sqlplus login to do the query, but there is a problem using PLSQL Developer, so what additional actions does PLSQL Developer do when querying? It is easy to locate this. By grabbing the SQL executed by the relevant session, we will see the following statement:

Begin

If: enable = 0 then

Sys.dbms_output.disable

Else

Sys.dbms_output.enable (: size)

End if

End

Next, we try to perform the process in the sqlplus command window:

Moreover, there is the same problem with executing other stored procedures for various packages under other sys users in the above manner; by contrast, there is no problem logging in to other users to execute the above stored procedures.

Yes! That's right! The problem is on the package that calls the sys user!

Weird access.

After all, iptXXX users are ordinary users. Sys users can't just call their packages. Let's take a look at the permissions:

SQL > select privilege from dba_sys_privs where grantee='IPTXXX'

PRIVILEGE

CREATE VIEW

SELECTANY DICTIONARY

CREATE SYNONYM

UNLIMITED TABLESPACE

SQL > select GRANTED_ROLE from dba_role_privs wheregrantee='IPTXXX'

GRANTED_ROLE

RESOURCE

CONNECT

SQL > select role, privilege from role_sys_privs where privilege like'% EXECUTE%PROCEDURE%'

ROLE PRIVILEGE

DBA EXECUTE ANY PROCEDURE

IMP_FULL_DATABASE EXECUTE ANY PROCEDURE

EXP_FULL_DATABASE EXECUTE ANY PROCEDURE

You can see that there seems to be nothing unusual about the permissions of iptXXX users, even a little more than that of ordinary users. It also gives an additional permission to view the data dictionary, and there is no problem with the query.

At the same time, we simply use the DESC command to view all the procedures / functions of the dbms_output package, there is no problem! In addition, in terms of personal operation habits, it seems that we seldom add "sys." when invoking sys users' packages. Prefix, for example, when we collect statistics about our own tables from ordinary users, we usually perform exec dbms_stats.gather_table_stats, so can't we even collect statistics here?

In this way, it is a bit more weird, from the result, whether to bring sys. The prefix is different! What is the relationship between sys.dbms_output and dbms_output? In the past, we often called various packages of sys users, but we didn't consider this problem, but we can easily understand the problems we encountered here.

SQL > selectobject_name,object_type,owner from dba_objects where object_name='DBMS_OUTPUT'

DBMS_OUTPUT PACKAGE SYS

DBMS_OUTPUT PACKAGE BODY SYS

DBMS_OUTPUT SYNONYM PUBLIC

SQL > select table_owner,table_name,owner fromdba_SYNONYMs where SYNONYM_NAME='DBMS_OUTPUT'

SYS DBMS_OUTPUT PUBLIC

It turns out that for sys users' packages, there are corresponding synonyms under public.

In this way, the problem is strange again. DBMS_OUTPUT is originally an object under sys, which can be accessed using a synonym for PUBLIC, but with the addition of "sys." What does it mean to be inaccessible after the prefix? It doesn't seem to be a simple question of permissions. Does the system configure some special security policy that does not allow the ordinary user to call the sys user's package directly, but only through the synonym of public?

I don't understand!

Add to the puzzle.

In the process of analysis, the customer DBA also hinted, how can you look at the PLSQL Developer "command window" to do the query?

In this regard, I first casually said that it was obviously because it did not call the sys user's package to do the operation, but I could not convince myself when I said the words, so I simply tracked and checked it. Really, there is also a call in the "command window". The statement to be called is as follows:

Begin: id: = sys.dbms_transaction.local_transaction_id; end

Obviously, the call here is successful! What is the situation? is the dbms_transaction package an exception? No, let's verify it again:

So let's think about it here. what's the difference? Obviously, dbms_transaction.commit is a procedure, while dbms_transaction. Local_transaction_id is a function;. By extension, it is found that the functions of sys users can be called, while procedure cannot be called with a prefix.

It seems to further verify the problem of the relevant security policy, is it still allowed in the security policy to call only function, not procedure? I have a paste in my head!

What a big pot of muddy water

It seems that the officials are going to be confused. At this point, we might as well sum up some of the problems we have encountered now:

There is no problem with 1.desc dbms_output/desc sys.dbms_output.

There is no problem with 2.exec dbms_output.disable (without sys. Prefix)

3.exec sys.dbms_output.disable is problematic (procedure, with sys. Prefix)

4. There is no problem with calling sys.dbms_transaction.local_transaction (function, with sys. Prefix)

The 5.iptXXX user rights do not see obvious vulnerabilities (for a package, will permissions be divided into procedure and functioin? )

6. Suspect that there is a special policy for stored procedure calls to sys user packages

After listing all the phenomena, we find that the problem seems to be more complicated than we thought. The most important thing is that we can't find a clue. Moreover, the problems of the production environment can not be tested at will! Ladies and gentlemen, if it were you, how would you judge, or how would you move on to the next step? Why don't we think.

There's always a way.

It seems that the problem can no longer be investigated, so we need to reflect on whether we positioned the problem too simply at the beginning, that is, the issue of authority, which affected our judgment and finally brought us to a dead end. Why don't we pull out and change our way of thinking and get rid of our stereotyped thinking!

First of all, PLS-00302/ORA-6500 is just an error report. Although we think it is a permission error, we will not consider it for the time being. For an error, we might as well do an errorstack trace of the process that caused the error:

SQL > oradebug setospid 3162340

Oraclepid: 55, Unix process pid: 3162340, image: oracle@XXXX (TNS V1-V3)

SQL > oradebug event 6550 trace name errorstack forever,level 12

Statement processed.

Then, get the trace file for errorstack that looks something like this:

However, things are not as smooth as we thought. Can you see the problem at a glance by giving you a trace file? Not really! Errorstack events seem to be set up by everyone, but whether you can read it or not is another matter. Like the errorstack setting above, obviously, it's just a processdump when you report an error. Dump has a lot of content, so it's hard to extract the information we need accurately from it! Especially when there is no direction of the problem.

Just leave the trace of errorstack for now. Maybe you can use it later.

Another way of thinking, the execution is the package under the sys user, after all, it is to call the SQL statement, so let's see what kind of SQL has been executed? We might as well do a 10046 event tracking!

SQL > oradebug setospid 4657324

Oracle pid:67, Unix process pid: 4657324, image: oracle@XXXX (TNS V1-V3)

SQL > oradebug event 10046 trace name context forever,level 12

Statement processed.

After setting the event, the trace file captured by sys.dbms_output.disable; is called on the appropriate session like this:

Yes, that's it, not much at all. There is a wait event "library cache lock". By checking that the object related to the wait is the dbms_standard package, this package is also owned by the sys user, and there is nothing special about it. There is also a problem when called under the iptXXX user.

Why is there so little trace crawled out in 10046? it seems that there is a mistake as soon as it is executed, and there is no chance of implementation at all.

In this way, it seems that I want to jump out of the dead end of authority, however, I have not found the direction of the problem!

Is there any other good way?

Try another way of thinking

At this point of the problem, it seems that there is no way to look into it. The search on MOS may be because the problem is not clear, so we can only use words such as PLS-00302/ORA-6500 to search. There are no helpful articles / attempts allowed in various production environments during bug;, and there seems to be no progress.

It seems that there is no better way, so we need to reflect: is there something wrong with the method we just did?

Can the trace grabbed by 1.errorstack help or not?

A: obviously there should be, but it is difficult to read. If we are not so familiar with its trace structure, then we should not stay here for a long time.

2.10046 can you catch the SQL it calls for such a statement?

A: no, as far as I can remember, 10046 can catch recursive calls to sql, but there is not a single SQL here, which seems unreasonable.

It seems that going back to the 10046 event is a good choice, and we confirm that when using 10046 to trace sys.dbms_transaction.local_transaction, there are many recursive SQL calls, and reading the sql one by one doesn't seem to get the information you want; however, here, it gives me a little inspiration: contrast! Yes, the previous verification problem has always been because in the production environment, there is no environment for random comparison and testing, resulting in a dead eye on the problem itself, lack of comparison; open up new ideas, how can we carry on the comparison?

For other users, there is no problem with calling the sys user's stored procedure, so let's take a look at what is done during the correct call? It's not so bad that you don't implement a SQL! Once again, do 10046 tracking for other users (let's say TEST users), and then successfully call the stored procedure under the sys user package. Sure enough, you can catch some SQL, but fortunately, there are not many, and the first thing that catches your eye is this SQL:

We might as well try to read it one by one and compare it one by one. Here is a sentence with three parameters, 49. Check in the production environment.

SQL > select obj#,type#,ctime,mtime,stime,status,dataobj#,flags,oid$, spare1, spare2 from obj$ where owner#=49

And name='SYS' and namespace=1 and remoteowner is null and linkname is null and subname is null

No rows selected

So, how do you compare? Obviously, 49 is the user ID (TEST user) from whom I call the stored procedure. What if it is executed by the iptXXX user?

SQL > select user_id from dba_users where username='IPTXXX'

USER_ID

thirty-four

SQL > select obj#,type#,ctime,mtime,stime,status,dataobj#,flags,oid$, spare1, spare2from obj$ where owner#=34 and name='SYS'and namespace=1 and remoteowner isnull and linkname isnull and subname isnull

OBJ# TYPE# CTIME MTIME STIME STATUS

DATAOBJ# FLAGS OID$ SPARE1 SPARE2

77574 2 19-JUN-09 07-JUL-17 19-JUN-09 1

77574 0 6 1

In fact, if you are an experienced DBA, you should be able to see where the problem lies; with one more step, everyone will suddenly be enlightened:

SQL > Select object_name,object_type,owner from dba_objects where object_id=77574

OBJECT_NAME OBJECT_TYPE OWNER

SYS TABLE IPTXXX

Yes, that's the way it is. After looking into the problem for most of the day, I feel that my IQ has been insulted by 10,000 points! I do not know there is a "XXX" I do not know when to speak improperly, can not help feeling, this is what kind of application design, what kind of database design ah? It turns out that under IPTXXX users, there is a watch named SYS! It turns out that the truth is excruciatingly simple!

Reflect! Reflect! Reflect!

After the problem is found out, it is often simple to look at it. If you only see this result, you may not gain much. In this regard, after I finished looking into this problem, it took me nearly a day to reflect deeply on the process, ideas and problems encountered in the process of dealing with this problem. I realized that this is definitely a case worth sharing.

Reflection 1:

Let's take a look at the whole process of the problem; there is a SYS object on the user IPTXXX, which causes a similar insufficient permission error when calling the stored procedure of the package under the SYS user; now, the impact on us is that PLSQL Developer cannot execute the query properly. Well, in other more complex business environments, for example, there are calls like PLSQL Developer in some black box application modules, and during the development process, the secondary development based on these black box modules and then adding the SYS table in the SCHEMA is enough to make the whole application unable to run normally. Generally speaking, as a DBA, we would think that ORACLE will retain keywords well to avoid being misused. However, from this CASE point of view, there is no (the reserved keywords of ORACLE can be found in V$RESERVED_WORDS). Then we need to give a reminder to the vast number of application designers / developers to avoid using keywords like SYS in the design process of objects! At the same time, for application maintenance / DBA, we need to check this situation before launch; for all kinds of SQL audit / application audit tool developers, we need to add this check to our tools.

Reflection 2:

Is it really that complicated to deal with this problem? I've been wrestling with permissions for so long:

We can obviously see that for the DBMS_LOCK package, when we do not have the permission to call, the error thrown by the database is PLS-00201, and the error we encounter is actually PLS-00302. In fact, in the face of unfamiliar error reports, if you want to confirm the meaning of the error code, you can first compare and confirm it in the form of simulated error reports.

Reflection 3:

Do we really have nothing to read about errorstack? What should we pay attention to the next time we encounter an operation that requires misjudgment? Let's review the error function thrown by errorstack:

If we carefully check the meaning of each function, we can see that the meaning function of kkxsem is actually KKX SEMantic phase, that is, semantic error; after confirming this function, we can confirm that the problem is not a permission problem to avoid misunderstanding! We can also pay more attention to callstack, which will be of great help in dealing with problems.

Reflection 4:

Is it true that 10046 events cannot track all SQL statements?

This is not the case, the 10046 event can trace all executed SQL statements! But for this case, the execution of the exec sys.dbms_output.disable () statement itself has a semantic error, and we are trying to track the recursive SQL generated during the execution of the statement Here, we can test that, in the process of parsing the "exec sys.dbms_output.disable ()" statement for the first time, we will need to load some data into library cache, this process needs to execute recursive SQL to implement, and after many executions, the relevant cursor of "exec sys.dbms_output.disable ()" has been cached in library cache lock, and recursive SQL will not be executed again, so we have not been able to trace it.

Reflection 5: does this count as a bug?

Can we assume that ORACLE forgets to use SYS as a reserved word and treats this problem as a bug? We don't draw any conclusions here.

Summary

For this CASE, if you do not understand the great harm of its conclusions, please read the "reflection one" section carefully. Anyway, after Lao K has finished his reflection, the first thing to do is to give advice to our customers to check whether all systems have such a similar table design.

From the point of view of the whole CASE, the final conclusion seems very simple, but we can understand that a bad application design is enough to make the application on the whole database unavailable. Therefore, a good design is the foundation for the normal operation of an application system.

In addition, for any CASE, we should not only get simple analysis results, the reflection and analysis of the analysis process is often more meaningful than the simple results; this is where we share this problem!

This article is reproduced in Zhongyi Antu.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.