-
Notifications
You must be signed in to change notification settings - Fork 623
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this. Changes are requested.
One potential problem with this PR is that we are switching catalog objects to be called 'entries', but then we are using object identifiers (oid_t
) to reference them.
src/binder/binder_context.cpp
Outdated
@@ -38,8 +38,10 @@ void BinderContext::AddRegularTable(const std::string db_name, | |||
const std::string table_alias, | |||
concurrency::TransactionContext *txn) { | |||
// using catalog object to retrieve meta-data | |||
auto table_object = catalog::Catalog::GetInstance()->GetTableObject( | |||
db_name, schema_name, table_name, txn); | |||
auto table_object = catalog::Catalog::GetInstance()->GetTableObject(txn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably rename this as GetTableEntry
too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this and decided to leave this as is for the following reason. A "Table" is a CatalogEntry inside the TableCatalog, and we are getting a "Table" outside of it, which makes sense. Naming this GetTableObject instead of GetTable makes this clear that we are getting a bunch of information we have on the table (the table "object" in a system), and not the contents of the table itself. (This would make even more sense if we have a glossary somewhere explaining this and use it as a naming convention.) "GetTableEntry" has the same confusing double meaning, and "GetTableCatalogEntry" is not ambiguous but long and doesn't make much sense without looking at the return type name. So the naming here is fine, but the typename needs to renamed because the type "TableObject" wouldn't make sense on its own.
Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should vote on this. I think Get*CatalogEntry
would be best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote Get*CatalogEntry
. I think the accuracy of function name is more important than length. And I think this name is not so long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd go with apavlo, ksaito7 and vote for ... Entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine, I'll rename them to GetxxxCatalogEntry
Catalog::GetInstance()->CreateTable( | ||
catalog_database_name, catalog_schema_name, catalog_table_name, | ||
std::unique_ptr<catalog::Schema>(catalog_table_schema), txn, true); | ||
Catalog::GetInstance()->CreateTable(txn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless this function also creates the DataTable
object (which it shouldn't), maybe we should rename this to CreateTableEntry
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Sorry I missed this. I will rename this to TableObject or TableEntry depending on what you think makes sense for the above comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ironically, I think it does create the DataTable object (Not that it should)
storage::Database *pg_catalog = nullptr, | ||
type::AbstractPool *pool = nullptr, | ||
concurrency::TransactionContext *txn = nullptr); | ||
static DatabaseCatalog *GetInstance(concurrency::TransactionContext *txn = nullptr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the default values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried but CLion's refactor feature really doesn't like our code and couldn't handle it. We can add another issue so somebody who has time to go through these by hand can do so in the future. (I will add this under #1398 later)
src/catalog/catalog.cpp
Outdated
|
||
// Insert peloton database into pg_database | ||
DatabaseCatalog::GetInstance()->InsertDatabase( | ||
CATALOG_DATABASE_OID, CATALOG_DATABASE_NAME, pool_.get(), txn); | ||
DatabaseCatalog::GetInstance(nullptr, nullptr, nullptr)->InsertDatabase(txn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why pass a null txn pointer to DatabaseCatalog::GetInstance()
when you actually have the txn pointer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, that CLion refactor thing I was talking about.
src/catalog/catalog.cpp
Outdated
if (txn == nullptr) | ||
throw CatalogException("Do not have transaction to drop schema " + | ||
schema_name); | ||
|
||
auto database_object = | ||
DatabaseCatalog::GetInstance()->GetDatabaseObject(database_name, txn); | ||
DatabaseCatalog::GetInstance(nullptr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. You actually have the txn pointer.
src/catalog/catalog.cpp
Outdated
const type::TypeId return_type, oid_t prolang, const std::string &func_src, | ||
std::shared_ptr<peloton::codegen::CodeContext> code_context, | ||
concurrency::TransactionContext *txn) { | ||
void Catalog::AddPlpgsqlFunction(concurrency::TransactionContext *txn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this should be called AddPlpgsqlFunction
. You are passing in the prolang
argument, so it should just be called AddFunction
, right? Furthermore, we are are referring to UDFs as procedures (i.e., pg_proc
table), so it really should be called AddProcedure
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had no idea what this is supposed to do. Will change.
@@ -1188,14 +1411,16 @@ void Catalog::InitializeLanguages() { | |||
auto &txn_manager = concurrency::TransactionManagerFactory::GetInstance(); | |||
auto txn = txn_manager.BeginTransaction(); | |||
// add "internal" language | |||
if (!LanguageCatalog::GetInstance().InsertLanguage("internal", pool_.get(), | |||
txn)) { | |||
if (!LanguageCatalog::GetInstance().InsertLanguage(txn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not your problem, but we should not be initializing the language table in this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make an issue for this so that we don't forget it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. #1421
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, let me say that this looks good. The changes in formatting help readability significantly.
Have added comments, but the things I feel could use some additional attention (mostly pre-existing):
-
Exceptions vs. PELOTON_ASSERT. Haven't analyzed the code, but it looks to me as if there are quite a few locations where they should be asserts. We are trying to enforce an internal requirement, it isn't a recoverable run-time error.
-
LOG_DEBUG. Unnecessary LOG_DEBUG where it should probably be LOG_TRACE. We should have a discussion about debug logging sometime, because the noise level, when one turns on tracing, makes it almost useless. To improve that situation, I think LOG_DEBUG should be used sparingly.
-
Use of the Proc abbreviation in function / class names. I think this should be more explicitly Procedure. While the it is mostly clear that it is Procedure and not Process, we should just be explicit.
AbstractCatalog::GetResultWithSeqScan( | ||
concurrency::TransactionContext *txn, | ||
expression::AbstractExpression *predicate, | ||
std::vector<oid_t> column_offsets) { | ||
if (txn == nullptr) throw CatalogException("Scan table requires transaction"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CatalogException vs. PELOTON_ASSERT.
Should this be an assert? Is it ever legitimate to call this function without a transaction? If not, it should be an ASSERT.
oid_t index_offset, | ||
std::vector<type::Value> scan_values, | ||
std::vector<oid_t> update_columns, | ||
std::vector<type::Value> update_values) { | ||
if (txn == nullptr) throw CatalogException("Scan table requires transaction"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto for exception vs. PELOTON_ASSERT comment above.
concurrency::TransactionContext *txn) { | ||
ResultType Catalog::CreateSchema(concurrency::TransactionContext *txn, | ||
const std::string &database_name, | ||
const std::string &schema_name) { | ||
if (txn == nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exception vs. PELOTON_ASSERT
index_name, | ||
{column_id}, | ||
true, | ||
IndexType::BWTREE); | ||
LOG_DEBUG("Added a UNIQUE index on %s in %s.", col_name.c_str(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG_DEBUG -> LOG_TRACE?
src/catalog/catalog.cpp
Outdated
// Check if UDF already exists | ||
auto proc_catalog_obj = | ||
ProcCatalog::GetInstance().GetProcByName(name, argument_types, txn); | ||
ProcCatalog::GetInstance().GetProcByName(txn, name, argument_types); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be clearer and more consistent to not use Proc. So rename to Procedure
ProcedureCatalog
GetProcedureByName
InsertProcedure
etc.
The local variables IMO can stay as is, the class names and class methods though, should change.
void SystemCatalogs::Bootstrap(const std::string &database_name, | ||
concurrency::TransactionContext *txn) { | ||
void SystemCatalogs::Bootstrap(concurrency::TransactionContext *txn, | ||
const std::string &database_name) { | ||
LOG_DEBUG("Bootstrapping database: %s", database_name.c_str()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG_TRACE?
LOG_DEBUG("Bootstrapping database: %s", database_name.c_str()); | ||
|
||
if (!pg_trigger_) { | ||
pg_trigger_ = new TriggerCatalog(database_name, txn); | ||
pg_trigger_ = new TriggerCatalog(txn, database_name); | ||
} | ||
|
||
// if (!pg_proc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is dead code, remove?
IndexConstraintType index_constraint); | ||
|
||
//===--------------------------------------------------------------------===// | ||
// Members | ||
//===--------------------------------------------------------------------===// | ||
|
||
// Maximum column name size for catalog schemas | ||
static const size_t max_name_size = 64; | ||
static const size_t max_name_size_ = 64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are going to change it ... lets replace with a more descriptive name. e.g. max_column_name_size_
max_name_size is very generic, could be any name.
std::unique_ptr<LanguageCatalogObject> GetLanguageByName( | ||
const std::string &lang_name, concurrency::TransactionContext *txn) const; | ||
std::unique_ptr<LanguageCatalogEntry> GetLanguageByName(concurrency::TransactionContext *txn, | ||
const std::string &lang_name) const; | ||
|
||
enum ColumnId { | ||
OID = 0, | ||
LANNAME = 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Poor abbreviation. Should be LANG or LANGUAGE
src/binder/binder_context.cpp
Outdated
@@ -38,8 +38,10 @@ void BinderContext::AddRegularTable(const std::string db_name, | |||
const std::string table_alias, | |||
concurrency::TransactionContext *txn) { | |||
// using catalog object to retrieve meta-data | |||
auto table_object = catalog::Catalog::GetInstance()->GetTableObject( | |||
db_name, schema_name, table_name, txn); | |||
auto table_object = catalog::Catalog::GetInstance()->GetTableObject(txn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd go with apavlo, ksaito7 and vote for ... Entry.
…u-catalog-cleanup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per conversation with Tian Yu, naming changes done, exception/assert and logging added to follow on issue.
* Catalog code cleanup * Rename "XXXObject" to "CatalogEntry" * Rename AddPlpgsqlFunction
Addresses issue #1398.
This PR fixes 1, 2 and renames CatalogObjects to Entries. Other minor code style fixes are included as well.